On Tue, Apr 26, 2011 at 8:49 AM, Leonardo Francalanci <m_li...@yahoo.it> wrote:
>> > If that 1% is random (not time/transaction related), usually you'd rather
>> > have an empty table.
>>
>> Why do you think it would be random?
>
> "Heap blocks would be zeroed if they were found to be damaged, following a
> crash."
>
> If you erase full blocks, you have no idea what data you erased; it could be
> something changed 1 hour ago, 1 month ago, or 1 year ago. This is very
> different from, say, synchronous_commit=off: in that case, "the most recent
> transactions may be lost if the database should crash". In your case, "some
> (who knows which???) data is lost". So, to me, that sounds like random loss.
> I don't think that is different from a corrupted table. You're not deleting
> rows recently changed; you're deleting everything that is "physically close"
> to them.
>
>> > In other words: is a table that is not consistent with anything else in
>> > the db useful?
>>
>> That's too big a leap. Why would it suddenly be inconsistent with the
>> rest of the database?
>
> If you delete some data, and you have no idea what data you lost, I don't
> think you have a consistent db. Unless, of course, your table has no relation
> with any other table in the db.
>
> Of course, all these thoughts are based on the assumption that I know what
> happens when a block is erased; but my knowledge of postgresql internals is
> not so good, so I might be *very* wrong.
You're assuming that there are referential links *from* other tables to the
table with damage, in which case you would be correct. But of course, if you
needed that data for integrity you would never do that, so the problem is a
nonexistent use case. The suggested mode is for fact data, not reference
tables.

The current assessment is that UNLOGGED tables are useful only for running a
data cache. If the database crashes, the table is truncated and you must
refill the cache. If that is the case, then it must surely be better to have
a cache that is already 99% full than one which starts out empty. There is no
damage or loss just because parts of the cache are missing. Unlogged tables
are currently so volatile that they are unusable for any other purpose.

I want to see a table that is useful for low-value data, such as sensor data.
If you had 10 TB of sensor data and the database crashed, you would want to
lose a few blocks, not the whole lot. Low value means rare, minor loss is
acceptable; it doesn't mean total data loss is acceptable. For that use case,
total loss is catastrophic, not just mildly irritating. If you are a telco,
losing a few minutes of billing data costs much less than giving every server
better hardware so it can cope with high WAL traffic as well. They don't want
to lose the data, but it's a cost-based trade-off.

Consistency is not an issue; you are just missing some data. That is normal
anyway, since sensor data generators (mobile devices etc.) frequently fail,
are offline, or are turned off, so there isn't even a definition of what
complete data is supposed to look like. The missing data looks exactly like
lots of people turning their phones off for a few minutes.

So my suggestion makes UNLOGGED tables more useful for the use case they were
designed to address - cached data (AIUI) - and it also enables another use
case that doesn't seem to be well understood: low-value data in massive
volumes.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
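
To make the truncate-on-crash behaviour described above concrete, here is a
minimal sketch (assuming PostgreSQL 9.1 or later, where UNLOGGED tables
exist; the table, columns, and data are purely illustrative):

-- An UNLOGGED table skips WAL, so loading low-value bulk data is cheap.
CREATE UNLOGGED TABLE sensor_readings (
    device_id   bigint       NOT NULL,
    recorded_at timestamptz  NOT NULL,
    reading     numeric
);

-- Load some illustrative data.
INSERT INTO sensor_readings
SELECT g, now(), random() * 100
FROM generate_series(1, 1000000) AS g;

-- Simulate a crash (e.g. pg_ctl stop -m immediate, then restart).
-- With the current behaviour, crash recovery truncates the table, so this
-- returns 0; under the suggested mode, undamaged blocks would be kept and
-- only damaged ones zeroed, so most of the rows would survive.
SELECT count(*) FROM sensor_readings;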