Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-20 Thread Ron Mayer
Josh Berkus wrote:
 With the current patches, the data survives a restart just fine.
 
 Per -hackers, that's not guarenteed.

Not guaranteed is fine. What people are asking for is often survives.

AFAIK we don't truncate the log file created by the log_filename GUC
on every unclean crash and every clean shutdown.

Should we? :-)

Why not?

For people who intend to use these tables to log application data,
they'd have the exact same reasons for not wanting them truncated
when they don't need to be.




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-20 Thread Josh Berkus



AFAIK we don't truncate the log file created by the log_filename GUC
on every unclean crash and every clean shutdown.


That's not a remotely relevant analogy.  A log file is not a database table.

If we allow a database table to become corrupted due to being unsynched 
at the time of shutdown, it's not a matter of missing a few rows.  The 
table is *unreadable*, and may cause the backend or even the whole 
server to crash when you try to read it.


Anyway, per discussion on hackers, unlogged tables (or volatile tables 
as they're now being called) include two modes in the spec; one which 
checkpoints (and thus can survive a planned restart) and one which 
doesn't (and will truncate on every restart, but doesn't cause physical 
I/O).  We may or may not have both modes for 9.1.



--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-19 Thread Jayadevan M
Hello,

  Thank you for the reply. But my doubt was not about layout, rather the 

  DMLs. If I do an insert into an 'unlogged' table, what happens to 
that? 
  Will that be replicated in the slave (using PostgreSQL's inbuilt 
  replication)?
 
 What are the use-cases for replicating unlogged tables?
 
I do not have a use case for replicating unlogged tables. But I may use 
temp/unlogged tables in my master to populate actual tables. 

Say, I have a db archival/purge process. I populate temp tables with PKs 
of my permanent tables and use that to drive my insertion into history 
tables, deletion from live tables etc. 

Pseudocode

Insert into mytemptable (id)  tables select mypk from liveable where 
lastuseddate  archivedate;

insert into myhist select a.* from livetable  a join mytemp table on 
a.mypk=mytemptable.id
delete from liveable where mypk in (select id from mytemp )

Reading about what goes into WAL tells me that the permanent table data 
will be replicated all right even if the temp tables are not logged. Is 
that right? 

Regards,
Jayadevan





DISCLAIMER: 

The information in this e-mail and any attachment is intended only for 
the person to whom it is addressed and may contain confidential and/or 
privileged material. If you have received this e-mail in error, kindly 
contact the sender and destroy all copies of the original communication. 
IBS makes no warranty, express or implied, nor guarantees the accuracy, 
adequacy or completeness of the information contained in this email or any 
attachment and is not liable for any errors, defects, omissions, viruses 
or for resultant loss or damage, if any, direct or indirect.






-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-19 Thread Ron Mayer
Glen Parker wrote:
 As was already mentioned, application logs.  Unlogged tables would be
 perfect for that, provided they don't go *poof* every now and then for
 no good reason.  Nobody's going to be too heart broken if a handful of
 log records go missing, or get garbled, after a server crash or power
 outage.  Delete 'em all after every restart though, and that's a problem.

How often are you doing unintentional restarts?

I'd guess for many people it's whenever I had so many backend crashes
that I get motivated to check if I'm running the latest minor release.

And if it's an intentional restart - surely you could archive your
application logs before doing the restart, no?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-18 Thread Karsten Hilbert
On Thu, Nov 18, 2010 at 08:49:12AM +0100, Alban Hertroys wrote:

 From the discussion so far it appears to me that
 unlogged should probably be split into various gradations
 of unlogged. There appear to be a number of popular
 use-cases for such tables, with different requirements,

That's precisely the point why this discussion doesn't lead
to a *solution*. It can only lead to a *decision*.

It seems that it needs to be decided first whether in the
case of unWALed tables we want PostgreSQL to provide *means*
or *policies*. The former are decidable and robustly
implementable in a piece of infrastructure software like
PostgreSQL. The latter are up to the whims of each
deployment site.

 Which leads me to think that people want three knobs to
 play with: should the table be logged or not? Can it be
 truncated at normal server restart or not? Should it be
 included in dumps or not? And possibly, should it be fsynced
 or not?

Yep, your analysis breaks down the policy stage (the grading
of logged) into modes or means which people can apply
to achieve a certain policies.

That is why I argued for options:

- alter database dump_unlogged_tables to on/off

default on: better safe than sorry, point the gun but don't pull the 
trigger

- pg_dump --include-unlogged-tables

default: whatever alter database says

- psqlrc:  \set include_unlogged_tables to on/off

default: doesn't exist, falls back to what alter database or 
--include-unlogged say

That way I can use certain means to work out the policy I
want, namely setting alter database to what it should be
on this database waaay before the time comes when it is
crucial to not forget --included-unlogged.

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-18 Thread Karsten Hilbert
On Thu, Nov 18, 2010 at 10:30:46AM +0100, Karsten Hilbert wrote:

 That is why I argued for options:
 
 - alter database dump_unlogged_tables to on/off
 
   default on: better safe than sorry, point the gun but don't pull the 
 trigger

(I agree, however, that the database metadata isn't really
the appropriate place to store application specific 
configuration items. So, maybe pgdumprc is a better place
for that.)

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-18 Thread Jayadevan M
Hello,

  PostgreSQL 9.1 is likely to have, as a feature, the ability to create
  tables which are unlogged, meaning that they are not added to the
  transaction log, and will be truncated (emptied) on database restart.
  Such tables are intended for highly volatile, but not very valuable,
  data, such as session statues, application logs, etc.
  

One doubt - if the tables are 'unlogged' , will the DMLs against these 
still be replicated to a slave? 

Regards,
Jayadevan





DISCLAIMER: 

The information in this e-mail and any attachment is intended only for 
the person to whom it is addressed and may contain confidential and/or 
privileged material. If you have received this e-mail in error, kindly 
contact the sender and destroy all copies of the original communication. 
IBS makes no warranty, express or implied, nor guarantees the accuracy, 
adequacy or completeness of the information contained in this email or any 
attachment and is not liable for any errors, defects, omissions, viruses 
or for resultant loss or damage, if any, direct or indirect.






-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-18 Thread Andy Colson

On 11/18/2010 3:46 AM, Jayadevan M wrote:

Hello,


PostgreSQL 9.1 is likely to have, as a feature, the ability to create
tables which are unlogged, meaning that they are not added to the
transaction log, and will be truncated (emptied) on database restart.
Such tables are intended for highly volatile, but not very valuable,
data, such as session statues, application logs, etc.



One doubt - if the tables are 'unlogged' , will the DMLs against these
still be replicated to a slave?

Regards,
Jayadevan


Yes, because the system tables, which store the layout of all tables, is 
written to WAL.


-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-18 Thread Josh Berkus



Does that sum it up adequately?


One more thing: that you might not get all of these options in 9.1. 
Currently the discussion is talking about *maybe* offering checkpointing 
of unlogged tables, which would allow such tables to survive a normal 
restart, and including unlogged tables in pg_dump by default.  But it's 
also possible that we'll only have one type of unlogged table in 9.1., 
with other options waiting for 9.2.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-18 Thread Jayadevan M
Hi,

  One doubt - if the tables are 'unlogged' , will the DMLs against these
  still be replicated to a slave?
 
 
 Yes, because the system tables, which store the layout of all tables, is 

 written to WAL.
Thank you for the reply. But my doubt was not about layout, rather the 
DMLs. If I do an insert into an 'unlogged' table, what happens to that? 
Will that be replicated in the slave (using PostgreSQL's inbuilt 
replication)?
Regards,
Jayadevan





DISCLAIMER: 

The information in this e-mail and any attachment is intended only for 
the person to whom it is addressed and may contain confidential and/or 
privileged material. If you have received this e-mail in error, kindly 
contact the sender and destroy all copies of the original communication. 
IBS makes no warranty, express or implied, nor guarantees the accuracy, 
adequacy or completeness of the information contained in this email or any 
attachment and is not liable for any errors, defects, omissions, viruses 
or for resultant loss or damage, if any, direct or indirect.






-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-18 Thread Alban Hertroys
On 19 Nov 2010, at 4:23, Jayadevan M wrote:

 Hi,
 
 One doubt - if the tables are 'unlogged' , will the DMLs against these
 still be replicated to a slave?
 
 
 Yes, because the system tables, which store the layout of all tables, is 
 
 written to WAL.
 Thank you for the reply. But my doubt was not about layout, rather the 
 DMLs. If I do an insert into an 'unlogged' table, what happens to that? 
 Will that be replicated in the slave (using PostgreSQL's inbuilt 
 replication)?

What are the use-cases for replicating unlogged tables?

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4ce6246e10421025920086!



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-18 Thread Marc Mamin
 What are the use-cases for replicating unlogged tables?

Hello,

I guess they could be useful in cloud infrastructures.

(see http://archives.postgresql.org/pgsql-general/2010-11/msg00865.php)

regards,

Marc Mamin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Karsten Hilbert
On Tue, Nov 16, 2010 at 10:25:13PM -0500, Tom Lane wrote:

 4. The last bit of discussion on -hackers concerned what to do in
 the case where the server got shut down cleanly.  If it was shut
 down cleanly, then any data for unlogged tables would have been
 written out from shared buffers ... but did the data make it to disk?
 There's no easy way to know that.  In the event of an OS crash or
 power failure shortly after server shutdown, it's possible that
 the unlogged tables would be corrupt.

Aaah, indeed.

 So Robert's initial proposal
 includes truncating unlogged tables at any database startup, even
 if the previous shutdown was clean.

Sounds reasonable.

 Some (including me) are arguing
 that that is unnecessarily strict; but you do have to realize that
 you're taking some risk with data validity

Don't. We've always liked PostgreSQL for that. Or at least
let us point the gun at our feet ourselves (such as with
fsync).

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Yeb Havinga

On 2010-11-17 02:55, Josh Berkus wrote:

If you do wish to have the data tossed out for no good reason every so
often, then there ought to be a separate attribute to control that.  I'm
really having trouble seeing how such behavior would be desirable enough
to ever have the server do it for you, on its terms rather than yours.

I don't quite follow you.  The purpose of unlogged tables is for data
which is disposable in the event of downtime; the classic example is the
a user_session_status table.
That sounds an awful lot like temporary tables. Perhaps the biggest 
problem of unlogged tables is that it doesn't connote truncate at 
restart. With the truncate an unlogged table is more like a 'cluster 
temporary table'. While this is a very ugly name, I wonder if an DBA 
would expect a cluster temporary table to be backed up by default.


I just filled in the questionaire, and to my surprise I agreed more with 
the 'don't backup by default' question. The reason is that because the 
question also said: because it contains disposable data. Maybe a better 
question would have been: would you expect pg_dump to backup unlogged 
tables, at the point that you didn't more about them than that they are 
not written to the WAL? In that case I'd say: yes.


regards,
Yeb Havinga


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Derrick Rice
On Wed, Nov 17, 2010 at 8:20 AM, Yeb Havinga yebhavi...@gmail.com wrote:

 That sounds an awful lot like temporary tables.


A lot like a GLOBAL temporary table, which isn't currently supported.

Is there a difference between a global temporary table (if such a thing
existed in PostgreSQL) and an unlogged table?

Derrick


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Scott Mead
On Tue, Nov 16, 2010 at 10:25 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Man, the number of misunderstandings in this thread is staggering.
 Let me try to explain what the proposed feature will and will not do.

 1. The system catalog entries for all tables will be wal-logged.
 So schema (DDL) will survive a crash.  There wouldn't be any way
 to make it not do that, because we can't wal-log only some updates
 to a particular table, and that includes the catalogs in particular.

 Gotcha


 2. What's proposed as the new feature is that specific non-system
 tables can be marked as unlogged, meaning that WAL entries won't
 be made for changes in those tables' contents (nor their indexes'
 contents).  So we can't guarantee that the contents of such tables
 will be correct or consistent after a crash.  The proposed feature
 deals with this by forcibly truncating all such tables after a crash,
 thus ensuring that they're consistent though not populated.  So the
 possible use-cases for such tables are limited to where (a) you can
 repopulate the tables on demand, or (b) you don't really care about
 losing data on a crash.


I would rather be allowed to decide that for myself.



 3. There's a lot of wishful thinking here about what constitutes a
 crash.  A backend crash *is* a crash, even if the postmaster keeps
 going.  Data that had been in shared buffers doesn't get written out
 in such a scenario (and if we tried, it might be corrupt anyway).  So
 unlogged tables would be corrupt and in need of truncation after such an
 event.  Obviously, the same goes for an OS-level crash or power failure.


Right, just let *me* decide, that's all.



 4. The last bit of discussion on -hackers concerned what to do in
 the case where the server got shut down cleanly.  If it was shut
 down cleanly, then any data for unlogged tables would have been
 written out from shared buffers ... but did the data make it to disk?
 There's no easy way to know that.  In the event of an OS crash or
 power failure shortly after server shutdown, it's possible that
 the unlogged tables would be corrupt.  So Robert's initial proposal
 includes truncating unlogged tables at any database startup, even
 if the previous shutdown was clean.  Some (including me) are arguing
 that that is unnecessarily strict; but you do have to realize that
 you're taking some risk with data validity if it doesn't do that.


It is too strict, it makes the feature barely more usable than a temp table.
As a DBA, I realize the implication of the feature:
   *) b0rked indexes
   *) b0rked data
   *) Not knowing what's good and what's bad
   *) Bad reports
   *) Bad Bi

etc..., etc... etc...

   Still, I'd rather be allowed to make the decision here.  I think that
having the database try to enforce integrity on something i've marked as
'corruptable' (via the 'unlogged' flag) will be a constant fight between me
and the system.  In the end, I'd just not use the feature.


 The bottom line here is that you really can only use the feature
 for data that you're willing to accept losing on no notice.
 Allowing the data to persist across clean shutdowns would probably
 improve usability a bit, but it's not changing that fundamental fact.


Agreed, and that's fine.  IMHO, it improves the usability 10 fold.  Having
it truncated on server restart is useful for only a fraction of the
use-cases for this feature.

--Scott



regards, tom lane



Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Scott Mead
On Wed, Nov 17, 2010 at 10:38 AM, Scott Mead sc...@scottrmead.com wrote:

 On Tue, Nov 16, 2010 at 10:25 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Man, the number of misunderstandings in this thread is staggering.
 Let me try to explain what the proposed feature will and will not do.

 1. The system catalog entries for all tables will be wal-logged.
 So schema (DDL) will survive a crash.  There wouldn't be any way
 to make it not do that, because we can't wal-log only some updates
 to a particular table, and that includes the catalogs in particular.

 Gotcha


 2. What's proposed as the new feature is that specific non-system
 tables can be marked as unlogged, meaning that WAL entries won't
 be made for changes in those tables' contents (nor their indexes'
 contents).  So we can't guarantee that the contents of such tables
 will be correct or consistent after a crash.  The proposed feature
 deals with this by forcibly truncating all such tables after a crash,
 thus ensuring that they're consistent though not populated.  So the
 possible use-cases for such tables are limited to where (a) you can
 repopulate the tables on demand, or (b) you don't really care about
 losing data on a crash.


 I would rather be allowed to decide that for myself.



 3. There's a lot of wishful thinking here about what constitutes a
 crash.  A backend crash *is* a crash, even if the postmaster keeps
 going.  Data that had been in shared buffers doesn't get written out
 in such a scenario (and if we tried, it might be corrupt anyway).  So
 unlogged tables would be corrupt and in need of truncation after such an
 event.  Obviously, the same goes for an OS-level crash or power failure.


 Right, just let *me* decide, that's all.



 4. The last bit of discussion on -hackers concerned what to do in
 the case where the server got shut down cleanly.  If it was shut
 down cleanly, then any data for unlogged tables would have been
 written out from shared buffers ... but did the data make it to disk?
 There's no easy way to know that.  In the event of an OS crash or
 power failure shortly after server shutdown, it's possible that
 the unlogged tables would be corrupt.  So Robert's initial proposal
 includes truncating unlogged tables at any database startup, even
 if the previous shutdown was clean.  Some (including me) are arguing
 that that is unnecessarily strict; but you do have to realize that
 you're taking some risk with data validity if it doesn't do that.


 It is too strict, it makes the feature barely more usable than a temp
 table.
 As a DBA, I realize the implication of the feature:
*) b0rked indexes
*) b0rked data
*) Not knowing what's good and what's bad
*) Bad reports
*) Bad Bi

 etc..., etc... etc...

Still, I'd rather be allowed to make the decision here.  I think that
 having the database try to enforce integrity on something i've marked as
 'corruptable' (via the 'unlogged' flag) will be a constant fight between me
 and the system.  In the end, I'd just not use the feature.


 The bottom line here is that you really can only use the feature
 for data that you're willing to accept losing on no notice.
 Allowing the data to persist across clean shutdowns would probably
 improve usability a bit, but it's not changing that fundamental fact.


 Agreed, and that's fine.  IMHO, it improves the usability 10 fold.  Having
 it truncated on server restart is useful for only a fraction of the
 use-cases for this feature.



Now that I've just sent that last piece, what about a 'truncate on restart'
option that is defaulted to on?  That way, the community feels good knowing
that we're trying to protect people from themselves, but like the 'fsync'
feature, I can load the gun and pull the trigger if I really want to.  I'd
like to see that so even if there is a server crash, it doesn't truncate.
That way, i can rename the garbage table if I want, create a new one for all
new data and then be allowed to glean what I can from the last one.

--Scott



 --Scott



regards, tom lane





Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Tom Lane
Derrick Rice derrick.r...@gmail.com writes:
 Is there a difference between a global temporary table (if such a thing
 existed in PostgreSQL) and an unlogged table?

Yes --- IIRC, a global temp table per spec has session-local contents.
An unlogged table acts just like any other table except with respect to
crash safety.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Steve Crawford

On 11/16/2010 07:25 PM, Tom Lane wrote:

Man, the number of misunderstandings in this thread is staggering
   


First, I have plenty of processes that I would immediately convert to 
using this (and, FWIW,  none of them would benefit from preserving data 
across restarts). But I have some questions that may expose my 
misunderstandings:


1. Would there be restrictions preventing a standard table from having a 
FK or other constraint that depends on an unlogged table? If not, it 
seems like there could be an unwanted ripple-effect from lost of the 
unlogged table.


2. Would it be possible to accidentally mix logged and unlogged tables 
in an inheritance chain? What would be the impact?


3. If unlogged data is included in a dump (my vote is no), would this 
lead to inconsistent behavior between dumps taken from a master and 
dumps taken from a hot-standby?


4. Would it be reasonable for temporary-tables to be unlogged by default?

Cheers,
Steve


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Tom Lane
Steve Crawford scrawf...@pinpointresearch.com writes:
 1. Would there be restrictions preventing a standard table from having a 
 FK or other constraint that depends on an unlogged table? If not, it 
 seems like there could be an unwanted ripple-effect from lost of the 
 unlogged table.

I would assume that we should disallow an FK referencing an unlogged
table from a regular table.  I don't know whether the current patch
covers that point, but if not it's an oversight.

 2. Would it be possible to accidentally mix logged and unlogged tables 
 in an inheritance chain? What would be the impact?

I don't see any logical problem there: some of the data in the
inheritance tree would be subject to loss on crash, other data not.
But the schema is all logged so no inconsistency arises.

 3. If unlogged data is included in a dump (my vote is no), would this 
 lead to inconsistent behavior between dumps taken from a master and 
 dumps taken from a hot-standby?

Hmm, that is a really interesting point.  You're right that a dump taken
from a standby slave could not possibly include such data, since for
lack of WAL it would never be propagated to the slave.  I am not sure
whether that inconsistency is a sufficiently good reason to not dump the
data from the master, though.  I think that in any case we are going to
want a pg_dump option to dump/not dump unlogged data --- the argument is
only about which behavior will be default.

I'm not sure that Robert's completely thought through the implications
of this patch for behavior on a slave, anyway.  Given the sequence
* take base backup from running system (which will surely
  include inconsistent data for unlogged tables)
* install base backup on slave
* run recovery, transitioning to hot standby
* go live
it's unclear to me where along the line the slave has an opportunity to
clean out its bogus images of the unlogged tables.  But it had better do
so before opening up for hot standby queries, let alone going live.

 4. Would it be reasonable for temporary-tables to be unlogged by default?

Temp tables already are, always have been, always will be, unlogged.
This patch is about attempting to bring that performance benefit of
a temp table to regular tables.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Josh Berkus



As was already mentioned, application logs. Unlogged tables would be
perfect for that, provided they don't go *poof* every now and then for
no good reason. Nobody's going to be too heart broken if a handful of
log records go missing, or get garbled, after a server crash or power
outage. Delete 'em all after every restart though, and that's a problem.


That's a nice thought, but it's not how data corruption works in the 
event of a crash.  If a table is corrupted, *we don't know* how it's 
corrupted, and it's not just the last few records which are corrupted. 
 So for unlogged tables, there is never going to be any other option 
for crashes than to truncate them.


Robert Haas did discuss the ability to synch unlogged tables on a 
planned shutdown, though.   However, that's liable to wait until 9.2, 
given the multiple steps required to make it work.


Note that you would have the option of periodically synching an unlogged 
table to pgdump or to a logged table, via script, if you cared about 
retaining the data.  That would probably give you the behavior you want, 
above.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Glen Parker

On 11/16/2010 07:25 PM, Tom Lane wrote:

2. What's proposed as the new feature is that specific non-system
tables can be marked as unlogged, meaning that WAL entries won't
be made for changes in those tables' contents (nor their indexes'
contents).  So we can't guarantee that the contents of such tables
will be correct or consistent after a crash.  The proposed feature
deals with this by forcibly truncating all such tables after a crash,
thus ensuring that they're consistent though not populated.  So the
possible use-cases for such tables are limited to where (a) you can
repopulate the tables on demand, or (b) you don't really care about
losing data on a crash.



Well if you guys would just hurry up and implement a way to mark indexes 
as inconsistent and continue to run without using them, you'd at least 
have the index problem solved :D


I was one of the guys drooling over WAL way back when it was a new 
feature.  I understand the risks in not logging updates, and most of the 
time won't accept the risk if I don't absolutely have to.  But, OTOH, 
before WAL, ALL tables were unlogged, and we still never lost any data 
that I'm aware of.  I don't remember ever having anything worse than 
errors from corrupt indexes.


Maybe there's some old code somewhere to repair PG tables that could be 
dusted off and updated...?


Heck, even if the postmaster refused to do anything with tables it 
thought might be FUBAR, and we had to repair them to even issue selects 
against them, in some cases that would still be better than having all 
the data go *poof*.


-Glen


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Scott Mead
On Wed, Nov 17, 2010 at 12:49 PM, Josh Berkus j...@agliodbs.com wrote:


  As was already mentioned, application logs. Unlogged tables would be
 perfect for that, provided they don't go *poof* every now and then for
 no good reason. Nobody's going to be too heart broken if a handful of
 log records go missing, or get garbled, after a server crash or power
 outage. Delete 'em all after every restart though, and that's a problem.


 That's a nice thought, but it's not how data corruption works in the event
 of a crash.  If a table is corrupted, *we don't know* how it's corrupted,
 and it's not just the last few records which are corrupted.  So for
 unlogged tables, there is never going to be any other option for crashes
 than to truncate them.

 Robert Haas did discuss the ability to synch unlogged tables on a planned
 shutdown, though.   However, that's liable to wait until 9.2, given the
 multiple steps required to make it work.

 Note that you would have the option of periodically synching an unlogged
 table to pgdump or to a logged table, via script, if you cared about
 retaining the data.  That would probably give you the behavior you want,
 above.


In an airplane, a pilot can kill the engine mid-flight if [s]he wants to.
They can deploy the flaps /slats at cruise speed / altitude, and if they're
so minded, they can land with a full tank of gas.  Now, none of these things
are particularly wise, but that's why the pilots are given *slightly* more
learning than your average bus driver.

  If you want to have a widely usable 'unlogged' table feature, I highly
recommend that 'truncate on server crash/restart' be an option that is
defaulted to true.  That way, I can go in an push the buttons I want and
give corrupted data to whomever, whenever i like.  (Land with a full tank of
Jet-A).

Whatever the decision is about backup, doesn't really matter IMO, but I
honestly think that the benefit of an unlogged table is there for both
session data (I run my session db's in fsync mode anyway and re-initdb them
on boot) AND for logging data where I can't take WAL anymore, but would like
to be able to have them in the same cluster as other stuff.  If they just
disappear then this feature won't be useful [to me] and I'll have to either
wait for the patch or give up on it and do a flat-file / lucene project just
to deal with it (I really don't want to do that :-).

--Scott




 --
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Marc Mamin
I would like to choose the table behaviour on restart (restore/forget it)


Currently, I'm looking for a way to split large transaction on different 
threads (with dblink).
AN issue is to efficiently share temp data across the threads. unlogged tables 
would be here fine, something like global temp tables with shared data.

here an example to illustrate the current situation:


select cic_connect_me('c');
select dblink_exec   ('c', 'drop table if exists my_share');
select dblink_exec   ('c', 'create table my_share( a int)');
select dblink_disconnect ('c');

SELECT cic_multithread(ARRAY[
  'insert into my_share select * from generate_series(1,1)',
   'insert into my_share select * from generate_series(1,1)',
   'insert into my_share select * from generate_series(1,1)',
   'insert into my_share select * from generate_series(1,1)',
   'insert into my_share select * from generate_series(1,1)',
  'insert into my_share select * from generate_series(1,1)']
,max_threads=4);

create temp table my_result as select * from my_share;
drop table my_share;

select * from my_result;

For pg dump, I guess that having an optional flag is fine, but:
unlogged tables could also be useful to store very large 'raw' data to be 
processed,
whereas the client would only query the processed results.
In such a case, restoring the logged table has a higher priority.
The best solution in my opinion, would allow to dump/restore these 2 table 
types in separate processes (or threads..).

(and by the way: would it be possible to choose the compress tool as an option 
for pg_dump)

pgdump -F.. -Compress pigz -f out.dmp -f_unlogged out_unlogged.dmp.

regards, 

Marc Mamin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Marc Mamin
Hello, 
another question.
I haven't read the complete threads, so I apologize if this was already 
discussed.

Will it be possible to switch from unlogged to logged ?

To improve COPY performances, I currently: 

- make a copy of heavily indexed tables
- load new data in the shadow table
- add the indexes
- drop the live table
- rename the shadow table to the visible one.

Is it imaginable to use unlogged tables foe the shadow one and then enable 
logging after the switch ?

regards,

Marc Mamin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Tom Lane
Marc Mamin m.ma...@intershop.de writes:
 Will it be possible to switch from unlogged to logged ?

Probably not, because it would completely confuse hot standby slaves
(or anything else looking at the WAL replay stream).  You can't just
start issuing WAL records against an already-built table or index,
because the WAL entries are deltas.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Marc Mamin
yep,

but I'll miss this as I only use WAL for crash recovery...

regards,

Marc Mamin

-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Mittwoch, 17. November 2010 23:40
To: Marc Mamin
Cc: PostgreSQL general; Josh Berkus
Subject: Re: [GENERAL] Survey on backing up unlogged tables: help us
with PostgreSQL development! 

Marc Mamin m.ma...@intershop.de writes:
 Will it be possible to switch from unlogged to logged ?

Probably not, because it would completely confuse hot standby slaves
(or anything else looking at the WAL replay stream).  You can't just
start issuing WAL records against an already-built table or index,
because the WAL entries are deltas.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Merlin Moncure
On Tue, Nov 16, 2010 at 5:46 PM, Josh Berkus j...@agliodbs.com wrote:
 Folks,

 Please help us resolve a discussion on -hackers.

 PostgreSQL 9.1 is likely to have, as a feature, the ability to create
 tables which are unlogged, meaning that they are not added to the
 transaction log, and will be truncated (emptied) on database restart.
 Such tables are intended for highly volatile, but not very valuable,
 data, such as session statues, application logs, etc.

 The question is, how would you, as a DBA, expect pg_dump backups to
 treat unlogged tables? Backing them up by default has the potential to
 both cause performance drag on the unlogged table and make your backups
 take longer unless you remember to omit them. Not backing them up by
 default has the drawback that if you forget --include-unlogged switch,
 and shut the database down, any unlogged data is gone. How would you
 *expect* unlogged tables to behave?

For 'as regular table' argument:
We are *assuming* the data is not very valuable.  I'd rather assume it
is valuable.  pg_dump has a very specific purpose: to back up the
database in it's current state into a file that can replay that state.
 Not backing up large objects by default is a huge gotcha imnsho.  I
understand the reasoning, but disagree with it.  Certainly a switch to
tune them out would be nice, but not the default.

Against argument:
hm, I guess the counter agrument is that since because they are not
WAL logged, they can't possibly be replayed to a standby, and it makes
sense to have pg_dump and archive log based backup behave in similar
fashion.  It's weird that different backup strategies produce
different results.

I think the 'against argument' is stronger by about .5, so I'm voting
.5 for pg_dump not to dump them.  In fact, if that side of it wins,
maybe pg_dump shouldn't even deal with them at all.  I guess that
would by my position.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-17 Thread Alban Hertroys
On 16 Nov 2010, at 23:46, Josh Berkus wrote:

 Folks,
 
 Please help us resolve a discussion on -hackers.
 
 PostgreSQL 9.1 is likely to have, as a feature, the ability to create
 tables which are unlogged, meaning that they are not added to the
 transaction log, and will be truncated (emptied) on database restart.
 Such tables are intended for highly volatile, but not very valuable,
 data, such as session statues, application logs, etc.
 
 The question is, how would you, as a DBA, expect pg_dump backups to
 treat unlogged tables? Backing them up by default has the potential to
 both cause performance drag on the unlogged table and make your backups
 take longer unless you remember to omit them. Not backing them up by
 default has the drawback that if you forget --include-unlogged switch,
 and shut the database down, any unlogged data is gone. How would you
 *expect* unlogged tables to behave?


From the discussion so far it appears to me that unlogged should probably be 
split into various gradations of unlogged. There appear to be a number of 
popular use-cases for such tables, with different requirements, namely:

1. Session tables
These tables contain data about a user session in some application. It is 
temporary data at best, it's no problem to lose it at all. Dumping it makes no 
sense.

2. Staging tables
These tables contain data that's being processed and prepared to be entered 
into other tables in the database. If the process fails it can usually be 
restarted, so losing the data is no problem. Here as well, dumping makes little 
sense.

3. Logging tables
This is data from application logs. It's not usually mission critical, so 
losing it isn't a very big deal, but it is useful data of itself. It should in 
most cases survive a normal backend restart, but if it doesn't survive a 
backend crash that's acceptable. This data should in most cases be included in 
dumps (or dumped separately?).

4. Materialized views
Stored results of a query that's likely to have a big footprint on system 
resources. Losing the data after a backend crash is acceptable, but it should 
survive a normal system restart. Since the data can be generated from the 
contents of the database, it's not necessary to include it in dumps (but maybe 
it is convenient in some cases?)

I think this is the gist of it.
Which leads me to think that people want three knobs to play with: should the 
table be logged or not? Can it be truncated at normal server restart or not? 
Should it be included in dumps or not? And possibly, should it be fsynced or 
not?

Of course, without WAL logs, PITR and WAL-based replication are out of the 
question for these tables. Also, since the data can be lost, they can't be 
referenced by foreign keys.

Does that sum it up adequately?


There's one thing that I didn't see mentioned and that I'm not sure fits into 
the picture here, namely read-only tables (materialized views would qualify for 
those in most cases).
These tables are written every once in a while by a system user, but it doesn't 
change in between at all. There's not much point in giving every user their own 
session, and it should be possible to assume all index entries point to a live 
record (which has consequences for COUNT(), for example).

Changing that has quite some implications though, I wager...

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4ce4daf610425035851824!



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Josh Berkus
Folks,

Please help us resolve a discussion on -hackers.

PostgreSQL 9.1 is likely to have, as a feature, the ability to create
tables which are unlogged, meaning that they are not added to the
transaction log, and will be truncated (emptied) on database restart.
Such tables are intended for highly volatile, but not very valuable,
data, such as session statues, application logs, etc.

The question is, how would you, as a DBA, expect pg_dump backups to
treat unlogged tables? Backing them up by default has the potential to
both cause performance drag on the unlogged table and make your backups
take longer unless you remember to omit them. Not backing them up by
default has the drawback that if you forget --include-unlogged switch,
and shut the database down, any unlogged data is gone. How would you
*expect* unlogged tables to behave?

Survey is here:
https://spreadsheets.google.com/ccc?key=0AoeuP3g2YZsFdDFnT2VKNC1FQ0pQNmJGS2dWMTNYMEEhl=enauthkey=CISbwuYD


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Karsten Hilbert
 PostgreSQL 9.1 is likely to have, as a feature, the ability to create
 tables which are unlogged, meaning that they are not added to the
 transaction log, and will be truncated (emptied) on database restart.
 Such tables are intended for highly volatile, but not very valuable,
 data, such as session statues, application logs, etc.
 
 The question is, how would you, as a DBA, expect pg_dump backups to
 treat unlogged tables? Backing them up by default has the potential to
 both cause performance drag on the unlogged table and make your backups
 take longer unless you remember to omit them. Not backing them up by
 default has the drawback that if you forget --include-unlogged switch,
 and shut the database down, any unlogged data is gone. How would you
 *expect* unlogged tables to behave?

ALTER DATABASE ... SET PG_DUMP_INCLUDE_UNLOGGED TO ON/OFF

with default OFF.

That way I can think about it once per database *before* I am in
the situation when I regret forgetting.

(pg_dump would still support --include-unlogged, defaulting to the
database default)

Karsten
-- 
Neu: GMX De-Mail - Einfach wie E-Mail, sicher wie ein Brief!  
Jetzt De-Mail-Adresse reservieren: http://portal.gmx.net/de/go/demail

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Derrick Rice
On Tue, Nov 16, 2010 at 5:46 PM, Josh Berkus j...@agliodbs.com wrote:


 Survey is here:

 https://spreadsheets.google.com/ccc?key=0AoeuP3g2YZsFdDFnT2VKNC1FQ0pQNmJGS2dWMTNYMEEhl=enauthkey=CISbwuYD


This is a link to a read-only spreadsheet for me.

Derrick


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Scott Ribe
On Nov 16, 2010, at 3:46 PM, Josh Berkus wrote:

  ...and will be truncated (emptied) on database restart.

I think that's key. Anything that won't survive a database restart, I sure 
don't expect to survive backup  restore.

-- 
Scott Ribe
scott_r...@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Glen Parker

On 11/16/2010 03:24 PM, Karsten Hilbert wrote:

PostgreSQL 9.1 is likely to have, as a feature, the ability to create
tables which are unlogged, meaning that they are not added to the
transaction log, and will be truncated (emptied) on database restart.
Such tables are intended for highly volatile, but not very valuable,
data, such as session statues, application logs, etc.


I have been following loosely this discussion on HACKERS, but seem to 
have missed the part about truncating such tables on server restart.


I have an immediate use for unlogged tables (application logs), but 
having them truncate after even a clean server restart would be a show 
stopper.  I keep log data for 2 months, and never back it up.  Having it 
disappear after a system melt down is acceptable, but not after a clean 
restart.  That would be utterly ridiculous!



As to the topic of the thread, I think pg_dump needs to dump unlogged 
tables by default.



-Glen


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Josh Berkus

 This is a link to a read-only spreadsheet for me.

You're correct.  Darn those Google unreadable links!

https://spreadsheets.google.com/viewform?formkey=dDFnT2VKNC1FQ0pQNmJGS2dWMTNYMEE6MQ

That should work.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Scott Mead
On Tue, Nov 16, 2010 at 7:21 PM, Glen Parker glene...@nwlink.com wrote:

 On 11/16/2010 03:24 PM, Karsten Hilbert wrote:

 PostgreSQL 9.1 is likely to have, as a feature, the ability to create
 tables which are unlogged, meaning that they are not added to the
 transaction log, and will be truncated (emptied) on database restart.
 Such tables are intended for highly volatile, but not very valuable,
 data, such as session statues, application logs, etc.


 I have been following loosely this discussion on HACKERS, but seem to have
 missed the part about truncating such tables on server restart.

 I have an immediate use for unlogged tables (application logs), but having
 them truncate after even a clean server restart would be a show stopper.  I
 keep log data for 2 months, and never back it up.  Having it disappear after
 a system melt down is acceptable, but not after a clean restart.  That would
 be utterly ridiculous!


+1  -- Is there a technical reason to do a TRUNCATE on restart?  I'd feel
better if I could just have unlogged tables that survive unless something
like a power-outage etc...  I'm in the exact same boat here, lots of big
logging tables that need to survive reboot, but are frustrating when it
comes to WAL generation.




 As to the topic of the thread, I think pg_dump needs to dump unlogged
 tables by default.

 -1 I disagree.  I'm fine with having the loaded weapon  pointed at my foot.

--Scott



 -Glen



 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Tom Lane
Scott Mead sc...@scottrmead.com writes:
 +1  -- Is there a technical reason to do a TRUNCATE on restart?  I'd feel
 better if I could just have unlogged tables that survive unless something
 like a power-outage etc...  I'm in the exact same boat here, lots of big
 logging tables that need to survive reboot, but are frustrating when it
 comes to WAL generation.

Keep in mind that these tables are *not* going to survive any type of
backend crash.  Maybe my perceptions are colored because I deal with
Postgres bugs all the time, but I think of backend crashes as pretty
common, certainly much more common than an OS-level crash.  I'm afraid
you may be expecting unlogged tables to be significantly more robust
than they really will be.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Andy Colson

On 11/16/2010 04:46 PM, Josh Berkus wrote:


PostgreSQL 9.1 is likely to have, as a feature, the ability to create
tables which are unlogged, meaning that they are not added to the
transaction log, and will be truncated (emptied) on database restart.
Such tables are intended for highly volatile, but not very valuable,
data, such as session statues, application logs, etc.


With the current patches, the data survives a restart just fine.

I'd like to vote for:
safe restart = save data
bad crashy restart = drop date

-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Glen Parker

On 11/16/2010 05:15 PM, Tom Lane wrote:

Keep in mind that these tables are *not* going to survive any type of
backend crash.  Maybe my perceptions are colored because I deal with
Postgres bugs all the time, but I think of backend crashes as pretty
common, certainly much more common than an OS-level crash.  I'm afraid
you may be expecting unlogged tables to be significantly more robust
than they really will be.


But an individual backend crash != server restart, unless that's changed 
since 8.1 (yes, I'm still stuck on 8.1 :( )...  So if I, for example, 
kill -9 a backend that's busy updating a nonlogged table, the table 
could be corrupted, but it wouldn't be truncated (and could cause 
trouble) for possibly weeks until the postmaster is restarted. 
Conversely, even if no backend crash occurs whatsoever, all the 
nonlogged tables would be truncated after an orderly postmaster restart.


Just doesn't make sense to me.

-Glen


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Josh Berkus

 With the current patches, the data survives a restart just fine.

Per -hackers, that's not guarenteed.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Andy Colson

On 11/16/2010 07:33 PM, Josh Berkus wrote:



With the current patches, the data survives a restart just fine.


Per -hackers, that's not guarenteed.



Ah, I just read the thread on -hackers.  And yea, my system had 24 hours to 
write/flush/etc before I'd restarted it moments ago as a test.  I have NOT 
tested a bunch of writes and then quickly restarting PG.  I CAN report that 
given 24 hours, your data will survive a restart :-)

-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Scott Marlowe
On Tue, Nov 16, 2010 at 5:23 PM, Scott Ribe scott_r...@elevated-dev.com wrote:
 On Nov 16, 2010, at 3:46 PM, Josh Berkus wrote:

  ...and will be truncated (emptied) on database restart.

 I think that's key. Anything that won't survive a database restart, I sure 
 don't expect to survive backup  restore.

I'd vote for backing up the schema of an unlogged table so it's there
on a restore.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Scott Mead
On Tue, Nov 16, 2010 at 8:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Scott Mead sc...@scottrmead.com writes:
  +1  -- Is there a technical reason to do a TRUNCATE on restart?  I'd feel
  better if I could just have unlogged tables that survive unless something
  like a power-outage etc...  I'm in the exact same boat here, lots of big
  logging tables that need to survive reboot, but are frustrating when it
  comes to WAL generation.

 Keep in mind that these tables are *not* going to survive any type of
 backend crash.


  Not surviving a crash is fine.  IMHO, if we'd lose data in myisam files,
I'm happy to lose them on pg nologging tables.  I just want it to survive a
stop / start operation.  The benefits (think of multi-host syslog
consolidation with FTS drools ) on these tables FAR outweigh the
off-chance that a crash will cause me some heartache.


 Maybe my perceptions are colored because I deal with
 Postgres bugs all the time, but I think of backend crashes as pretty
 common, certainly much more common than an OS-level crash.  I'm afraid
 you may be expecting unlogged tables to be significantly more robust
 than they really will be.



Bugs?  What bugs :)

  Honestly, I've only had a couple of *Prod* crashes (knocks on wood), but
the need to restart occurs every now and then.

--Scott






regards, tom lane



Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Glen Parker

On 11/16/2010 05:33 PM, Josh Berkus wrote:



With the current patches, the data survives a restart just fine.


Per -hackers, that's not guarenteed.



Which is fine.  If you choose to set a table to nonlogged, that implies 
that you accept the risk of corrupted data, or that you don't get it, 
in which case .  It should not however, imply that you want it all 
thrown out every so often for no good reason.


If you do wish to have the data tossed out for no good reason every so 
often, then there ought to be a separate attribute to control that.  I'm 
really having trouble seeing how such behavior would be desirable enough 
to ever have the server do it for you, on its terms rather than yours.


-Glen


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Josh Berkus

 If you do wish to have the data tossed out for no good reason every so
 often, then there ought to be a separate attribute to control that.  I'm
 really having trouble seeing how such behavior would be desirable enough
 to ever have the server do it for you, on its terms rather than yours.

I don't quite follow you.  The purpose of unlogged tables is for data
which is disposable in the event of downtime; the classic example is the
a user_session_status table.  In the event of a restart, all user
sessions are going to be invalid anyway.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Josh Berkus

 I'd vote for backing up the schema of an unlogged table so it's there
 on a restore.

The schema is always there.  What may or may not be there is the data.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Andy Colson

On 11/16/2010 07:55 PM, Josh Berkus wrote:



If you do wish to have the data tossed out for no good reason every so
often, then there ought to be a separate attribute to control that.  I'm
really having trouble seeing how such behavior would be desirable enough
to ever have the server do it for you, on its terms rather than yours.


I don't quite follow you.  The purpose of unlogged tables is for data
which is disposable in the event of downtime; the classic example is the
a user_session_status table.  In the event of a restart, all user
sessions are going to be invalid anyway.



Why?  If you dont blow away the sessions table, everything should be fine.

-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Glen Parker

On 11/16/2010 05:55 PM, Josh Berkus wrote:



If you do wish to have the data tossed out for no good reason every so
often, then there ought to be a separate attribute to control that.  I'm
really having trouble seeing how such behavior would be desirable enough
to ever have the server do it for you, on its terms rather than yours.


I don't quite follow you.  The purpose of unlogged tables is for data
which is disposable in the event of downtime; the classic example is the
a user_session_status table.  In the event of a restart, all user
sessions are going to be invalid anyway.



As was already mentioned, application logs.  Unlogged tables would be 
perfect for that, provided they don't go *poof* every now and then for 
no good reason.  Nobody's going to be too heart broken if a handful of 
log records go missing, or get garbled, after a server crash or power 
outage.  Delete 'em all after every restart though, and that's a problem.


-Glen


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Survey on backing up unlogged tables: help us with PostgreSQL development!

2010-11-16 Thread Tom Lane
Man, the number of misunderstandings in this thread is staggering.
Let me try to explain what the proposed feature will and will not do.

1. The system catalog entries for all tables will be wal-logged.
So schema (DDL) will survive a crash.  There wouldn't be any way
to make it not do that, because we can't wal-log only some updates
to a particular table, and that includes the catalogs in particular.

2. What's proposed as the new feature is that specific non-system
tables can be marked as unlogged, meaning that WAL entries won't
be made for changes in those tables' contents (nor their indexes'
contents).  So we can't guarantee that the contents of such tables
will be correct or consistent after a crash.  The proposed feature
deals with this by forcibly truncating all such tables after a crash,
thus ensuring that they're consistent though not populated.  So the
possible use-cases for such tables are limited to where (a) you can
repopulate the tables on demand, or (b) you don't really care about
losing data on a crash.

3. There's a lot of wishful thinking here about what constitutes a
crash.  A backend crash *is* a crash, even if the postmaster keeps
going.  Data that had been in shared buffers doesn't get written out
in such a scenario (and if we tried, it might be corrupt anyway).  So
unlogged tables would be corrupt and in need of truncation after such an
event.  Obviously, the same goes for an OS-level crash or power failure.

4. The last bit of discussion on -hackers concerned what to do in
the case where the server got shut down cleanly.  If it was shut
down cleanly, then any data for unlogged tables would have been
written out from shared buffers ... but did the data make it to disk?
There's no easy way to know that.  In the event of an OS crash or
power failure shortly after server shutdown, it's possible that
the unlogged tables would be corrupt.  So Robert's initial proposal
includes truncating unlogged tables at any database startup, even
if the previous shutdown was clean.  Some (including me) are arguing
that that is unnecessarily strict; but you do have to realize that
you're taking some risk with data validity if it doesn't do that.

The bottom line here is that you really can only use the feature
for data that you're willing to accept losing on no notice.
Allowing the data to persist across clean shutdowns would probably
improve usability a bit, but it's not changing that fundamental fact.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general