Re: [HACKERS] Unlogged tables, persistent kind

2011-05-03 Thread Simon Riggs
On Tue, Apr 26, 2011 at 8:49 AM, Leonardo Francalanci m_li...@yahoo.it wrote:
  If that 1% is random (not time/transaction related), usually you'd rather
  have an empty table.

 Why do you think it would be random?

 Heap blocks would be zeroed if they were found to be damaged, following a
 crash.

 If you erase full blocks, you have no idea what data you erased; it could be
 something changed 1 hour ago, 1 month ago, or 1 year ago. This is very
 different from, say, synchronous_commit=off: in that case, only the most
 recent transactions may be lost if the database crashes. In your case, some
 (who knows which???) data is lost. So, to me that sounds like random loss. I
 don't think that is different from a corrupted table. You're not deleting
 recently changed rows; you're deleting everything that is physically close
 to them.

  In other words: is a table that is not consistent with anything else in
  the db useful?

 That's too big a leap. Why would it suddenly be inconsistent with the
 rest of the database?

 If you delete some data, and you have no idea what data you lost, I don't
 think you have a consistent db. Unless, of course, your table has no
 relation with any other table in the db.

 Of course, all these thoughts are based on the assumption that I know what
 happens when a block is erased; but my knowledge of PostgreSQL internals is
 not so good, so I might be *very* wrong.

You're assuming that there are referential links *from* other tables
to the table with damage. In which case you would be correct. But of
course, if you needed that data for integrity you would never do that,
so the problem is a nonexistent use case. The suggested mode is for
Fact data, not reference tables.

The current assessment is that UNLOGGED tables are useful only for
running a data cache. If the database crashes, then the table is
truncated and you must refill the cache. If that is the case, then it
must surely be better to start with a cache that is already 99% full
than with one that starts empty. Parts of the cache being missing
causes no damage or loss.
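
For reference, that cache pattern maps directly onto the 9.1 feature as it
stands; the table and column names below are invented purely for
illustration:

    -- Illustrative cache table (names made up). UNLOGGED skips WAL for the
    -- table, so it is fast to write but is truncated after a crash, and the
    -- application must be prepared to repopulate it.
    CREATE UNLOGGED TABLE page_cache (
        url          text PRIMARY KEY,
        rendered     text NOT NULL,
        generated_at timestamptz NOT NULL DEFAULT now()
    );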

Unlogged tables are currently so volatile that they are unusable for any
other purpose. I want to see a table that is useful for low-value data,
such as sensor data.
If you had 10 TB of sensor data and the database crashed, you would want
to lose a few blocks, not the whole lot. Low value means rare, minor loss
is acceptable; it does not mean total data loss is acceptable. For that
use case, total loss is catastrophic, not just mildly irritating. If you
are a telco, losing a few minutes' billing data costs much less than
giving every server better hardware so it can cope with the full WAL
traffic as well. They don't want to lose the data, but it's a cost-based
trade-off. Consistency is not an issue; you are just missing some data.
That is normal anyway, since sensor data generators (mobile devices etc.)
frequently fail, are offline, are turned off and so on, so there isn't
even a definition of what complete data is supposed to look like. The
missing data looks exactly like lots of people turning their phones off
for a few minutes.

So my suggestion makes UNLOGGED tables more useful for the use case they
were designed to address - cached data (AIUI) - and it also allows
another use case that doesn't seem to be well understood: low-value data
in massive volumes.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Unlogged tables, persistent kind

2011-05-03 Thread Greg Stark
On Tue, May 3, 2011 at 8:21 PM, Simon Riggs si...@2ndquadrant.com wrote:
 The current assessment is that UNLOGGED tables are useful only for
 running a data cache. If the database crashes, then the table is
 truncated and you must refill the cache. If that is the case, then it
 must surely be better to have a cache that is already 99% full, than
 one which starts at empty. There is no damage or loss because parts of
 the cache were missing.

That's not necessarily the case, of course. I've written caches before
where a set of records would be present for each object being cached. I
could deal with a whole set being absent, but if just some of the records
were missing then I would serve incorrect results. Since they were
created in a single transaction, I should be able to rely on the whole
set being present or missing consistently.
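
A minimal sketch of the kind of cache described here, with invented names:
all rows for one object are written in a single transaction, and a reader
assumes the set is either fully present or fully absent.

    -- Illustrative only: one *set* of rows per cached object.
    CREATE UNLOGGED TABLE permission_cache (
        object_id bigint  NOT NULL,
        user_id   bigint  NOT NULL,
        can_read  boolean NOT NULL,
        PRIMARY KEY (object_id, user_id)
    );

    -- If a crash silently removed only some of object 42's rows, this
    -- returns a partial (i.e. wrong) permission set instead of the cache
    -- miss the application knows how to handle.
    SELECT user_id, can_read FROM permission_cache WHERE object_id = 42;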

That doesn't mean there aren't cases where that is true, of course. In an
ideal world the database would guarantee that you couldn't use the table
in this way and then fail to get the consistency you expect - something
like raising an error if you try to modify two rows in an unlogged table
in a single transaction. I'm not sure how to do that reasonably, though.


 For that use case, total loss is catastrophic, not just mildly
 irritating. If you are a Telco, losing a few minutes billing data
 costs much less than having every server have better hardware so it
 can cope with high WAL traffic as well. They don't want to lose the
 data, but its a cost based trade off.

This analysis is dead on. That's precisely how businesses evaluate
this question.


 Consistency is not an issue, you
 are just missing some data. That is normal anyway, since sensor data
 generators (mobile devices etc) frequently fail, are offline, turned
 off etc, so there isn't even a definition of what complete data is
 supposed to look like. The missing data looks exactly like lots of
 people turned their phones off for a few minutes.

I don't think that's true, however. Consider if I have a rollup table
that contains aggregated sums of that data. If you lose some of the
records, then my aggregates no longer add up to the correct values.

Or consider if you are counting sensor data like total data transferred
and session count -- and then reporting the average data transferred per
session. If you accidentally lose a bunch of sessions, your data-per-session
report will be reporting false information.

This is true even if you only lose recently committed rows. But losing
whole blocks means you risk losing random old data, which makes it hard
to work around by, say, purging recently aggregated data.
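
To make that concrete, here is an illustrative pair of tables (all names
invented) under the behaviour proposed in this thread, where a crash may
silently zero some blocks of the fact table; the final query shows the
rollup disagreeing with what is left of the base data.

    -- Fact table: under the proposal, a crash could zero some of its blocks.
    CREATE UNLOGGED TABLE sensor_readings (
        sensor_id  int         NOT NULL,
        reading_ts timestamptz NOT NULL,
        bytes_sent bigint      NOT NULL
    );

    -- Rollup maintained separately (trigger, batch job, ...), fully logged.
    CREATE TABLE sensor_rollup (
        sensor_id   int    PRIMARY KEY,
        total_bytes bigint NOT NULL,
        row_count   bigint NOT NULL
    );

    -- After blocks of sensor_readings are silently lost, this reports
    -- sensors whose rollup claims totals the remaining rows cannot support.
    SELECT sensor_id, r.total_bytes, s.recomputed
    FROM   sensor_rollup r
    JOIN  (SELECT sensor_id, sum(bytes_sent) AS recomputed
           FROM sensor_readings GROUP BY sensor_id) s USING (sensor_id)
    WHERE  r.total_bytes <> s.recomputed;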

 So my suggestion makes UNLOGGED tables more useful for the use case
 they were designed to address - cached data (AIUI), plus they allow
 another use case that doesn't seem to be well understood, low value
 data in massive data volumes.

There are other approaches as well. Foreign data wrappers mean you could
do things like store the low-value data in raw text files or in other
systems like memcached or Hadoop. I'm not saying there's no reason to do
something in Postgres, but if you're being bitten by Postgres's
block-oriented random-access storage, the problem may be too fundamental
to solve without addressing the underlying storage strategy.


-- 
greg



Re: [HACKERS] Unlogged tables, persistent kind

2011-05-03 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 You're assuming that there are referential links *from* other tables
 to the table with damage. In which case you would be correct. But of
 course, if you needed that data for integrity you would never do that,
 so the problem is a nonexistent use case. The suggested mode is for
 Fact data, not reference tables.

So I suppose your notion of "fact data" includes no fields that are wide
enough to need toasting?  Because as soon as you have any out-of-line
values, there's an equivalent of foreign keys behind the scenes, where
the user can't see it (until he gets "missing chunk number" or some such
error).

 The current assessment is that UNLOGGED tables are useful only for
 running a data cache. If the database crashes, then the table is
 truncated and you must refill the cache. If that is the case, then it
 must surely be better to have a cache that is already 99% full, than
 one which starts at empty. There is no damage or loss because parts of
 the cache were missing.

A cache that starts at 99% full of untrustworthy data is NOT better.

 Unlogged Tables are currently so volatile they are unusable for any
 other purpose. I want to see a table that is useful for low value
 data, such as sensor data.

Basically, you're being hopelessly optimistic both about the extent to
which a crash is likely to render data inconsistent, and our ability to
detect that inconsistency.  It doesn't matter whether the data is low
value, the difficulty of cleaning up remains the same.  I don't want to
deal with trying to detect that, and I definitely don't want to dump the
problems onto users.

regards, tom lane



Re: [HACKERS] Unlogged tables, persistent kind

2011-05-03 Thread Christopher Browne
On Tue, May 3, 2011 at 4:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 You're assuming that there are referential links *from* other tables
 to the table with damage. In which case you would be correct. But of
 course, if you needed that data for integrity you would never do that,
 so the problem is a nonexistent use case. The suggested mode is for
 Fact data, not reference tables.

 So I suppose your notion of fact data includes no fields that are wide
 enough to need toasting?  Because as soon as you have any out-of-line
 values, there's an equivalent of foreign keys behind the scenes, where
 the user can't see it (until he gets missing chunk number or some such
 error).

 The current assessment is that UNLOGGED tables are useful only for
 running a data cache. If the database crashes, then the table is
 truncated and you must refill the cache. If that is the case, then it
 must surely be better to have a cache that is already 99% full, than
 one which starts at empty. There is no damage or loss because parts of
 the cache were missing.

 A cache that starts at 99% full of untrustworthy data is NOT better.

That's a bit pessimistic.

The case that bugs me is that a cache that's 99% trustworthy, but where I
have no idea
a) that 1% of it is crud, and
b) which 1% of it is crud,
is still a pretty unacceptable scenario.

I head back to our policy for handling caches:
  If in doubt, TRUNCATE.

That policy would be nicely consistent with the way 9.1 deals with
unlogged tables.

 Unlogged Tables are currently so volatile they are unusable for any
 other purpose. I want to see a table that is useful for low value
 data, such as sensor data.

 Basically, you're being hopelessly optimistic both about the extent to
 which a crash is likely to render data inconsistent, and our ability to
 detect that inconsistency.  It doesn't matter whether the data is low
 value, the difficulty of cleaning up remains the same.  I don't want to
 deal with trying to detect that, and I definitely don't want to dump the
 problems onto users.

+1, on both grounds.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-26 Thread Leonardo Francalanci
  If that 1% is random (not time/transaction related), usually you'd rather
  have an empty table.
 
 Why do you think it would be random?

Heap blocks would be zeroed if they were found to be damaged, following a 
crash.

If you erase full blocks, you have no idea what data you erased; it could be
something changed 1 hour ago, 1 month ago, or 1 year ago. This is very
different from, say, synchronous_commit=off: in that case, only the most
recent transactions may be lost if the database crashes. In your case, some
(who knows which???) data is lost. So, to me that sounds like random loss. I
don't think that is different from a corrupted table. You're not deleting
recently changed rows; you're deleting everything that is physically close
to them.
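
For contrast, the asynchronous-commit behaviour referred to above is an
existing setting whose loss window is bounded to the most recently
committed transactions (roughly a few wal_writer_delay intervals), never
to old blocks. A minimal sketch, with an invented table name:

    -- Illustrative contrast: asynchronous commit. A crash can lose only
    -- transactions committed in the last moments, never data written
    -- long ago.
    CREATE TABLE measurements (sensor_id int, ts timestamptz, value bigint);
    SET synchronous_commit = off;
    BEGIN;
    INSERT INTO measurements VALUES (1, now(), 512);
    COMMIT;  -- returns before this commit's WAL reaches disk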

  In other words: is a table that is not consistent with anything else in
  the db useful?

 That's too big a leap. Why would it suddenly be inconsistent with the
 rest of the database?


If you delete some data, and you have no idea what data you lost, I don't
think you have a consistent db. Unless, of course, your table has no
relation with any other table in the db.

Of course, all these thoughts are based on the assumption that I know what
happens when a block is erased; but my knowledge of PostgreSQL internals is
not so good, so I might be *very* wrong.



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-26 Thread Cédric Villemain
2011/4/25 Christopher Browne cbbro...@gmail.com:
 On Mon, Apr 25, 2011 at 2:03 PM, Jesper Krogh jes...@krogh.cc wrote:
 On 2011-04-25 20:00, Leonardo Francalanci wrote:
 The amount of data loss on a big table will be 1% of the data
 loss caused by truncating the whole table.

 If that 1% is random (not time/transaction related), usually you'd
 rather have an empty table. In other words: is a table that is not
 consistant with anything else in the db useful?

 Depends on the application, if it serves for pure caching then it is fully
 acceptable and way
 better than dropping everything.

 Whoah...

 When cacheing, the application already needs to be able to cope with
 the case where there's nothing in the cache.

 This means that if the cache gets truncated, it's reasonable to expect
 that the application won't get deranged - it already needs to cope
 with the case where data's not there and needs to get constructed.

That is true, but the application's performance already has to cope with
a server crash/restart. Anything you can add to make the restart smoother
(for the application) is good.


 In contrast, if *wrong* data is in the cache, that could very well
 lead to wrong behavior on the part of the application.

 And there may not be any mechanism aside from cache truncation that
 will rectify that.

 It seems to me that it's a lot riskier to try to preserve contents of
 such tables than it is to truncate them.
 --
 When confronted by a difficult problem, solve it by reducing it to the
 question, How would the Lone Ranger handle this?





-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Robert Haas
On Apr 24, 2011, at 1:22 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Unlogged tables are a good new feature.

Thanks.

 I noticed Bruce had mentioned they were the equivalent of NoSQL, which
 I don't really accept.

Me neither. I thought that was poorly said.

 Heap blocks would be zeroed if they were found to be damaged, following a 
 crash.

The problem is not so much the blocks that are damaged (e.g. half-written, torn 
page) but the ones that were never written at all. For example, read page A, 
read page B, update tuple on page A putting new version on page B, write one 
but not both of A and B out to the O/S, crash.  Everything on disk is a valid 
page, but they are not coherent taken as a whole.  It's normally XLOG replay 
that fixes this type of situation...

I thought about this problem a bit and I think you could perhaps deal with it 
by having some sort of partially logged table, where we would XLOG just enough 
to know which blocks or relations had been modified and only nuke enough data 
to be certain of being safe. But it isn't clear that there is much use case for 
this, especially because I think it would give up nearly all the performance 
benefit.

I do think it might be useful to have an unlogged index on a logged table, 
somehow frobnicated so that on a crash the index is known invalid and not used 
until a REINDEX is performed.

...Robert


Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Leonardo Francalanci
The only data we can't rebuild is the heap. So what about an option for
UNlogged indexes on a LOGged table? It would always preserve the data, and
it would 'only' cost rebuilding the indexes after an unclean shutdown. I
think it would give a boost in performance for all those cases where the
I/O (especially random I/O) is caused by the indexes, and it doesn't look
too complicated (but maybe I'm missing something).

I proposed the unlogged-to-logged patch (BTW, has anyone taken a look at
it?) because we partition data based on a timestamp: we can risk losing
the last N minutes of data, but after N minutes we want to know the data
will always be there, so we would like to be able to set a partition
table to 'logged'.
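
A sketch of the intended workflow (table names invented). The ALTER form
shown is the syntax this idea eventually took in PostgreSQL 9.5; in 2011
it exists only as the proposed patch, so treat it as an illustration of
the proposal rather than of anything available in 9.1:

    -- Current partition absorbs the high insert rate without WAL overhead;
    -- a crash may cost its most recent rows.
    CREATE UNLOGGED TABLE readings_2011_04_25 (
        sensor_id  int,
        reading_ts timestamptz,
        value      bigint
    );

    -- Once the partition is closed (no longer receiving inserts), promote
    -- it to a normal, crash-safe table.
    ALTER TABLE readings_2011_04_25 SET LOGGED;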

Leonardo



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Simon Riggs
On Mon, Apr 25, 2011 at 8:36 AM, Leonardo Francalanci m_li...@yahoo.it wrote:
 The only data we can't rebuild it's the heap. So what about an option for 
 UNlogged indexes on a LOGged table? It would always preserve data, and it 
 would 'only' cost a rebuilding of the indexes in case of an unclean shutdown. 
 I think it would give a boost in performance for all those cases where the IO 
 (especially random IO) is caused by the indexes, and it doesn't look too 
 complicated (but maybe I'm missing something).

 I proposed the unlogged to logged patch (BTW has anyone given a look at it?) 
 because we partition data based on a timestamp, and we can risk loosing the 
 last N minutes of data, but after N minutes we want to know data will always 
 be there, so we would like to set a partition table to 'logged'.

I agree that unlogged indexes on a logged heap are better for
resilience and are likely to be the best first step.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Simon Riggs
On Mon, Apr 25, 2011 at 8:14 AM, Robert Haas robertmh...@gmail.com wrote:
 On Apr 24, 2011, at 1:22 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Unlogged tables are a good new feature.

 Thanks.

 I noticed Bruce had mentioned they were the equivalent of NoSQL, which
 I don't really accept.

 Me neither. I thought that was poorly said.

 Heap blocks would be zeroed if they were found to be damaged, following a 
 crash.

 The problem is not so much the blocks that are damaged (e.g. half-written, 
 torn page) but the ones that were never written at all. For example, read 
 page A, read page B, update tuple on page A putting new version on page B, 
 write one but not both of A and B out to the O/S, crash.  Everything on disk 
 is a valid page, but they are not coherent taken as a whole.  It's normally 
 XLOG replay that fixes this type of situation...

Not really sure it matters what the cause of the data loss is, does it?
Zeroing the blocks definitely causes data loss, but the intention is to
bring the table back to a consistent physical state, not to repair the
data loss in any way.

To repeat my words above: this proposed option trades potential minor
data loss for performance.

The amount of data loss on a big table will be 1% of the data loss
caused by truncating the whole table.

This is important on big tables where reloading from a backup might
take a long time.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Robert Haas
On Mon, Apr 25, 2011 at 5:04 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, Apr 25, 2011 at 8:14 AM, Robert Haas robertmh...@gmail.com wrote:
 On Apr 24, 2011, at 1:22 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Unlogged tables are a good new feature.

 Thanks.

 I noticed Bruce had mentioned they were the equivalent of NoSQL, which
 I don't really accept.

 Me neither. I thought that was poorly said.

 Heap blocks would be zeroed if they were found to be damaged, following a 
 crash.

 The problem is not so much the blocks that are damaged (e.g. half-written, 
 torn page) but the ones that were never written at all. For example, read 
 page A, read page B, update tuple on page A putting new version on page B, 
 write one but not both of A and B out to the O/S, crash.  Everything on disk 
 is a valid page, but they are not coherent taken as a whole.  It's normally 
 XLOG replay that fixes this type of situation...

 Not really sure it matters what the cause of data loss is, does it?
 The zeroing of the blocks definitely causes data loss but the
 intention is to bring the table back to a consistent physical state,
 not to in any way repair the data loss.

Right, but the trick is how you identify which blocks you need to
zero.  You used the word "damaged", which to me implied that the block
had been modified in some way but ended up with other than the
expected contents, so that something like a CRC check might detect the
problem.  My point (as perhaps you already understand) is that you
could easily have a situation where every block in the table passes a
hypothetical block-level CRC check, but the table as a whole is still
damaged because update chains aren't coherent.  So you need some kind
of mechanism for identifying which portions of the table you need to
zero to get back to a guaranteed-coherent state.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Robert Haas
On Mon, Apr 25, 2011 at 3:36 AM, Leonardo Francalanci m_li...@yahoo.it wrote:
 The only data we can't rebuild it's the heap. So what about an option for 
 UNlogged indexes on a LOGged table? It would always preserve data, and it 
 would 'only' cost a rebuilding of the indexes in case of an unclean shutdown. 
 I think it would give a boost in performance for all those cases where the IO 
 (especially random IO) is caused by the indexes, and it doesn't look too 
 complicated (but maybe I'm missing something).

+1.

 I proposed the unlogged to logged patch (BTW has anyone given a look at it?) 
 because we partition data based on a timestamp, and we can risk loosing the 
 last N minutes of data, but after N minutes we want to know data will always 
 be there, so we would like to set a partition table to 'logged'.

That approach is something I had also given some thought to, and I'm
glad to hear that people are thinking about doing it in the real
world.  I'm planning to look at your patch, but I haven't gotten to it
yet, because I'm giving priority to anything that must be done to get
9.1beta1 out the door.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Leonardo Francalanci
 The amount of data loss on a big table will be 1% of the data loss
 caused by truncating the whole table.

If that 1% is random (not time/transaction related), usually you'd rather
have an empty table. In other words: is a table that is not consistent
with anything else in the db useful?



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Jesper Krogh

On 2011-04-25 20:00, Leonardo Francalanci wrote:

 The amount of data loss on a big table will be 1% of the data
 loss caused by truncating the whole table.

 If that 1% is random (not time/transaction related), usually you'd
 rather have an empty table. In other words: is a table that is not
 consistant with anything else in the db useful?

Depends on the application: if it serves for pure caching, then it is
fully acceptable and way better than dropping everything.

--
Jesper



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Kevin Grittner
Jesper Krogh jes...@krogh.cc wrote:
 On 2011-04-25 20:00, Leonardo Francalanci wrote:
 The amount of data loss on a big table will be 1% of the data
 loss caused by truncating the whole table.

  If that 1% is random (not time/transaction related), usually
  you'd rather have an empty table. In other words: is a table
  that is not consistant with anything else in the db useful?

 Depends on the application, if it serves for pure caching then it
 is fully acceptable and way better than dropping everything.
 
I buy this *if* we can be sure we're not keeping information which
is duplicated or mangled, and if we can avoid crashing the server to
a panic because of broken pointers or other infelicities.  I'm not
sure that can't be done, but I don't think I've heard an explanation
of how that could be accomplished, particularly without overhead
which would wipe out the performance benefit of unlogged tables. 
(And without a performance benefit, what's the point?)
 
-Kevin



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Christopher Browne
On Mon, Apr 25, 2011 at 2:03 PM, Jesper Krogh jes...@krogh.cc wrote:
 On 2011-04-25 20:00, Leonardo Francalanci wrote:
 The amount of data loss on a big table will be 1% of the data
 loss caused by truncating the whole table.

 If that 1% is random (not time/transaction related), usually you'd
 rather have an empty table. In other words: is a table that is not
 consistant with anything else in the db useful?

 Depends on the application, if it serves for pure caching then it is fully
 acceptable and way
 better than dropping everything.

Whoah...

When caching, the application already needs to be able to cope with
the case where there's nothing in the cache.

This means that if the cache gets truncated, it's reasonable to expect
that the application won't get deranged - it already needs to cope
with the case where data's not there and needs to get constructed.

In contrast, if *wrong* data is in the cache, that could very well
lead to wrong behavior on the part of the application.

And there may not be any mechanism aside from cache truncation that
will rectify that.

It seems to me that it's a lot riskier to try to preserve contents of
such tables than it is to truncate them.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Simon Riggs
On Mon, Apr 25, 2011 at 7:00 PM, Leonardo Francalanci m_li...@yahoo.it wrote:
 The amount of data loss on a big
 table will be 1% of the data loss
caused by truncating the whole table.

 If that 1% is random (not time/transaction related), usually you'd rather 
 have an empty table.

Why do you think it would be random?


 In other words: is a table that is not consistant with anything else in the 
 db useful?

That's too big a leap. Why would it suddenly be inconsistent with the
rest of the database?


Not good arguments.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Simon Riggs
On Mon, Apr 25, 2011 at 1:42 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Apr 25, 2011 at 5:04 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, Apr 25, 2011 at 8:14 AM, Robert Haas robertmh...@gmail.com wrote:
 On Apr 24, 2011, at 1:22 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Unlogged tables are a good new feature.

 Thanks.

 I noticed Bruce had mentioned they were the equivalent of NoSQL, which
 I don't really accept.

 Me neither. I thought that was poorly said.

 Heap blocks would be zeroed if they were found to be damaged, following a 
 crash.

 The problem is not so much the blocks that are damaged (e.g. half-written, 
 torn page) but the ones that were never written at all. For example, read 
 page A, read page B, update tuple on page A putting new version on page B, 
 write one but not both of A and B out to the O/S, crash.  Everything on 
 disk is a valid page, but they are not coherent taken as a whole.  It's 
 normally XLOG replay that fixes this type of situation...

 Not really sure it matters what the cause of data loss is, does it?
 The zeroing of the blocks definitely causes data loss but the
 intention is to bring the table back to a consistent physical state,
 not to in any way repair the data loss.

 Right, but the trick is how you identify which blocks you need to
 zero.  You used the word damaged, which to me implied that the block
 had been modified in some way but ended up with other than the
 expected contents, so that something like a CRC check might detect the
 problem.  My point (as perhaps you already understand) is that you
 could easily have a situation where every block in the table passes a
 hypothetical block-level CRC check, but the table as a whole is still
 damaged because update chains aren't coherent.  So you need some kind
 of mechanism for identifying which portions of the table you need to
 zero to get back to a guaranteed-coherent state.

That sounds like progress.

The current mechanism is to truncate the complete table. There are clearly
other mechanisms that would not remove all data.

Probably the common case would be for insert-only data.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Robert Haas
On Mon, Apr 25, 2011 at 2:21 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Right, but the trick is how you identify which blocks you need to
 zero.  You used the word damaged, which to me implied that the block
 had been modified in some way but ended up with other than the
 expected contents, so that something like a CRC check might detect the
 problem.  My point (as perhaps you already understand) is that you
 could easily have a situation where every block in the table passes a
 hypothetical block-level CRC check, but the table as a whole is still
 damaged because update chains aren't coherent.  So you need some kind
 of mechanism for identifying which portions of the table you need to
 zero to get back to a guaranteed-coherent state.

 That sounds like progress.

 The current mechanism is truncate complete table. There are clearly
 other mechanisms that would not remove all data.

No doubt.  Consider a block B.  If the system crashes when block B is
dirty either in the OS cache or shared_buffers, then you must zero B,
or truncate it away.  If it was clean in both places, however, it's
good data and you can keep it.

So you can imagine, for example, a scheme where the relation is divided
into 8MB chunks, and we WAL-log the first operation after each checkpoint
that touches a chunk.  Replay zeroes the chunk, and we also invalidate all
the indexes (the user must REINDEX to get them working again).  I think
that would be safe, and certainly the WAL-logging overhead would be far
less than WAL-logging every change, since we'd need to emit only ~16 bytes
of WAL for every 8MB written, rather than ~8MB of WAL for every 8MB
written.  It wouldn't allow some of the optimizations that the current
unlogged tables can get away with only because they WAL-log exactly
nothing - and selectively zeroing chunks of a large table might slow down
startup quite a bit - but it might still be useful to someone.

However, I think that the "logged table, unlogged index" idea is
probably the most promising thing to think about doing first.  It's
easy to imagine all sorts of uses for that sort of thing even in cases
where people can't afford to have any data get zeroed, and it would
provide a convenient building block for something like the above if we
eventually wanted to go that way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-25 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 However, I think that the logged table, unlogged index idea is
 probably the most promising thing to think about doing first.

+1 for that --- it's clean, has a clear use-case, and would allow us
to manage the current mess around hash indexes more cleanly.
That is, hash indexes would always be treated as unlogged.

(Or of course we could fix the lack of WAL logging for hash indexes,
but I notice a lack of people stepping up to do that.)
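
For context, a hash index in this era already behaves much like an
unlogged index on a logged table, since hash index operations are not
WAL-logged: after a crash the heap is fine, but the index may need to be
rebuilt by hand before it can be trusted. Names below are invented for
illustration:

    -- Illustrative: the heap is WAL-logged, the hash index is not.
    CREATE TABLE accounts (id bigint PRIMARY KEY, balance numeric);
    CREATE INDEX accounts_id_hash ON accounts USING hash (id);

    -- After a crash, rebuild the hash index before relying on it.
    REINDEX INDEX accounts_id_hash;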

regards, tom lane



[HACKERS] Unlogged tables, persistent kind

2011-04-24 Thread Simon Riggs
Unlogged tables are a good new feature.

I noticed Bruce had mentioned they were the equivalent of NoSQL, which I
don't really accept. I guess it depends upon whether you mean NoSQL for
caches (e.g. memcached) or NoSQL for useful datastores (e.g. Mongo). It
seems worth discussing now, before we get too far into the marketing hype
around Beta.

If you don't log changes to tables, you have two choices after a crash:
1) truncate the table and any indexes
2) rebuild any indexes damaged by the crash

Currently, we do (1). That certainly has its place, but most data stores
don't do this when they crash, since it would lead to data loss. Not just
a couple of rows either - serious, major data loss if you put the wrong
kind of data in it. We even delete data that has been safely on disk for
weeks or months, which IMHO some people could easily get confused about.

In the future, I would like to work on (2), which preserves as much
data as possible, while recognising indexes may be damaged. I don't
really have any name for this, since the current naming seems to
assume there is only one kind of unlogged table.

My implementation path for that would be to add a crash_number to
pg_control and pg_index. Any index marked as 'unlogged, persistent' would
only be usable if its crash number is the same as the current system
crash number.

REINDEX would update the index's crash number to the current value. That
also allows us to imagine a 'repair index' command in the future as well.

Heap blocks would be zeroed if they were found to be damaged, following a crash.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-24 Thread Greg Stark
On Sun, Apr 24, 2011 at 6:22 PM, Simon Riggs si...@2ndquadrant.com wrote:
 My implementation path for that would be to add a crash_number onto
 pg_control and pg_index. Any index marked as unlogged, persistent
 would only be usable if it's crash number is the same as current
 system crash number.

 REINDEX would update the index crash number to current value. That
 also allows us to imagine a repair index command in the future as
 well.

This seems useful for non-crash-safe indexes in general.

 Heap blocks would be zeroed if they were found to be damaged, following a 
 crash.


How do you propose to detect that? Until we solve the whole checksum
story, I don't think we have a reliable way to detect bad pages. And in
some cases where we do detect them, we would detect them by crashing.

-- 
greg



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-24 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 If you don't log changes to tables you have two choices if we crash
 1) truncate the table and any indexes
 2) rebuild any indexes damaged by the crash

No, you have only one choice, and that's (1), because there's no
guarantee that what's in the table file is meaningful.

 Heap blocks would be zeroed if they were found to be damaged, following a 
 crash.

This is sheerest fantasy.  And even if you could implement it, what sort
of feature would you be offering?  "Your data is preserved except when
it isn't"?  People who want that can go use mysql.

regards, tom lane



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-24 Thread Simon Riggs
On Sun, Apr 24, 2011 at 10:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 If you don't log changes to tables you have two choices if we crash
 1) truncate the table and any indexes
 2) rebuild any indexes damaged by the crash

 No, you have only one choice, and that's (1), because there's no
 guarantee that what's in the table file is meaningful.

 Heap blocks would be zeroed if they were found to be damaged, following a 
 crash.

 This is sheerest fantasy.  And even if you could implement it, what sort
 of feature would you be offering?  Your data is preserved except when
 it isn't?  People who want that can go use mysql.

AFAIUI, a great many people do.

I am proposing a non-default mode, requiring explicit activation by the
user, which preserves as much data as possible. I am fully aware that
what is proposed is not an optimisation but a downgrading of normal
resilience in exchange for some data loss in the event of a crash.

Yes, many other systems support this and people are becoming persuaded
that such risk/reward choices make sense for them.
I see no reason not to provide an option to do this, so people can
make informed choices.

For large sets of low-value data, it makes sense. Deleting all the data
just because some of it might be damaged is not the only option. IMHO
deleting all the data is surprising behaviour that will cause many people
to curse us. I don't see preserving some of the data as being worse.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-24 Thread Christopher Browne
On Sun, Apr 24, 2011 at 6:15 PM, Simon Riggs si...@2ndquadrant.com wrote:
 For large sets of low value data, it makes sense. Deleting all data,
 just simply because some of it might be damaged, is not the only
 option. IMHO deleting all the data is a surprising option that will
 cause many people to curse us. I don't see preserving some of the data
 as being worse.

For the cache table case, it is *certainly* reasonable to delete
everything upon discovering the risk that some might be damaged.

I have seen cases in production where the fix to 'oh, looks like
something's corrupted' is indeed 'truncate the cache', where that has
become a Standard Operating Procedure.

Sure, to have Postgres do this is a bit heavy-handed, but it's not as if
this isn't going to be heavily documented with warnings like: "Contains
live plague bacteria.  Beware the Rabid Hippopotami.  May cause bleeding
at the eyes.  If your database crashes, this table WILL get truncated at
startup time."  If the docs are short on warnings, that should probably
get rectified.  (I'm allowed to be volunteered to do so :-).)

I'd actually find it unsurprising for such tables to get truncated even
on a clean restart; I'd favor that being an option, as long as making it
so were generally unproblematic.  If the application using such a table
can't cope with that logic, better to induce an understanding of that
sooner rather than later ;-).

It seems like a losing battle to try terribly hard to keep the data
around when, by marking it unlogged, the data definition specifically
warned that this was risky.

I'd not go so far as to suggest having autovac TRUNCATE such tables at
random intervals, but that's a pathology that's not completely
incompatible with the DDL declaration :-)
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-24 Thread Simon Riggs
On Sun, Apr 24, 2011 at 7:41 PM, Greg Stark gsst...@mit.edu wrote:
 On Sun, Apr 24, 2011 at 6:22 PM, Simon Riggs si...@2ndquadrant.com wrote:
 My implementation path for that would be to add a crash_number onto
 pg_control and pg_index. Any index marked as unlogged, persistent
 would only be usable if it's crash number is the same as current
 system crash number.

 REINDEX would update the index crash number to current value. That
 also allows us to imagine a repair index command in the future as
 well.

 This seems useful for non-crash-safe indexes in general.

 Heap blocks would be zeroed if they were found to be damaged, following a 
 crash.


 How do you propose to detect that? Until we solve the whole checksum
 story I don't think we have a reliable way to detect bad pages. And in
 some cases where do detect them we would detect them by crashing.


That should be changed.


-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Unlogged tables, persistent kind

2011-04-24 Thread Greg Stark
On Sun, Apr 24, 2011 at 11:15 PM, Simon Riggs si...@2ndquadrant.com wrote:
 IMHO deleting all the data is a surprising option that will
 cause many people to curse us. I don't see preserving some of the data
 as being worse.

What possible damage do you want to recover from?

Without WAL logging, after a software crash it's possible for update
chains to be broken, for multiple copies of the same tuple to be visible,
for some tuples to disappear but not others, etc.

And without checksums, after a hardware crash it'll be possible for pages
to be torn, resulting in tuple pointers that land in the middle of
nowhere, or tuples that start off fine but are half overwritten with
unrelated garbage.

-- 
greg
