On Tue, Apr 26, 2011 at 8:49 AM, Leonardo Francalanci <m_li...@yahoo.it> wrote:
>> > If that 1% is random (not time/transaction related), usually you'd rather
>> > have an empty table.
>>
>> Why do you think it would be random?
>
> "Heap blocks would be zeroed if they were found to be damaged, following a
> crash."
>
> If you erase full blocks, you have no idea what data you erased; it could be
> something changed 1 hour ago, 1 month ago, or 1 year ago. This is very
> different from, say, synchronous_commit=off: in that case, "the most recent
> transactions may be lost if the database should crash". In your case, "some
> (who knows which???) data is lost". So, to me, that sounds like random loss.
> I don't think that is different from a corrupted table. You're not deleting
> rows recently changed; you're deleting everything that is "physically close"
> to them.
>
>> > In other words: is a table that is not consistent with anything else in
>> > the db useful?
>>
>> That's too big a leap. Why would it suddenly be inconsistent with the
>> rest of the database?
>
> If you delete some data, and you have no idea what data you lost, I don't
> think you have a consistent db. Unless, of course, your table has no relation
> with any other table in the db.
>
> Of course, all these thoughts are based on the assumption that I know what
> happens when a block is erased; but my knowledge of postgresql internals is
> not so good, so I might be *very* wrong.
You're assuming that there are referential links *from* other tables to the
table with damage, in which case you would be correct. But of course, if you
needed that data for integrity you would never do that, so the problem is a
nonexistent use case. The suggested mode is for fact data, not reference
tables.

The current assessment is that UNLOGGED tables are useful only for running a
data cache. If the database crashes, the table is truncated and you must
refill the cache. If that is the case, then it must surely be better to have
a cache that is already 99% full than one which starts out empty. There is no
damage or loss just because parts of the cache are missing. Unlogged tables
are currently so volatile that they are unusable for any other purpose.

I want to see a table that is useful for low-value data, such as sensor data.
If you had 10 TB of sensor data and the database crashed, you would want to
lose a few blocks, not the whole lot. Low value means rare, minor loss is
acceptable; it doesn't mean total data loss is acceptable. For that use case,
total loss is catastrophic, not just mildly irritating. If you are a telco,
losing a few minutes of billing data costs much less than giving every server
better hardware so it can cope with high WAL traffic as well. They don't want
to lose the data, but it's a cost-based trade-off.

Consistency is not an issue; you are just missing some data. That is normal
anyway, since sensor data generators (mobile devices etc.) frequently fail,
are offline, or are turned off, so there isn't even a definition of what
complete data is supposed to look like. The missing data looks exactly like
lots of people turning their phones off for a few minutes.

So my suggestion makes UNLOGGED tables more useful for the use case they were
designed to address - cached data (AIUI) - and it also enables another use
case that doesn't seem to be well understood: low-value data in massive
volumes.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
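
To make the truncate-on-crash behaviour described above concrete, here is a
minimal sketch (assuming PostgreSQL 9.1 or later, where UNLOGGED tables
exist; the table, columns, and data are purely illustrative):

-- An UNLOGGED table skips WAL, so loading low-value bulk data is cheap.
CREATE UNLOGGED TABLE sensor_readings (
    device_id   bigint       NOT NULL,
    recorded_at timestamptz  NOT NULL,
    reading     numeric
);

-- Load some illustrative data.
INSERT INTO sensor_readings
SELECT g, now(), random() * 100
FROM generate_series(1, 1000000) AS g;

-- Simulate a crash (e.g. pg_ctl stop -m immediate, then restart).
-- With the current behaviour, crash recovery truncates the table, so this
-- returns 0; under the suggested mode, undamaged blocks would be kept and
-- only damaged ones zeroed, so most of the rows would survive.
SELECT count(*) FROM sensor_readings;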