On 5 March 2014 18:22, Israel Brewster <isr...@eraalaska.net> wrote:

> I have a Postgresql 9.2.4 database containing real-time tracking data for
> our aircraft for the past week (theoretically). It is populated by two
> different processes: one that runs every few minutes, retrieving data from
> a number of sources and storing it in the DB, and one that has an "always
> on" connection to the DB streaming data into the database in realtime
> (often several records per second). To keep the database size manageable I
> have a cron job that runs every evening to delete all records that are more
> than a week old, after archiving a subset of them in permanent storage.
>
> This morning my boss e-mailed me, complaining that only a couple of
> aircraft were showing up in the list (SELECT distinct(tail) FROM data being
> the command that populates the list). Upon looking at the data I saw that
> it only went back to 4am this morning, rather than the week I was
> expecting. My first thought was "Oh, I must have a typo in my cleanup
> routine, such that it is deleting all records rather than only those a week
> old, and it's just that no one has noticed until now". So I looked at that,
> but changing the delete to a select appeared to produce the proper results,
> in that no records were selected:
>
> DELETE FROM data WHERE pointtime<now() AT TIME ZONE 'UTC'-interval '7
> days';
>
> Then I noticed something even more odd. My database has an id column,
> which is defined as a SERIAL. As we all know, a serial is a monotonically
> increasing number that is not affected by deletes. However, the oldest
> record in my database, from 4am this morning, had an id of 1. Even though I
> KNOW there was data in the system yesterday. Even if my DELETE command was
> wrong and deleted ALL records, that shouldn't have reset the SERIAL column
> to 1! I also know that I have not been in the database mucking around with
> the sequence value - to be completely honest, I don't even know the exact
> command to reset it - I'd have to google it if I wanted to.
>
> Also odd is that my cleanup script runs at 1am. I have records of there
> being new data in the database up to 3:51am, but the oldest record
> currently in the DB is from 4:45am (as specified by the default of now() on
> the column). So I know records were added after my delete command ran, but
> before this reset occurred.
>
> So my question is, aside from someone going in and mucking about in the
> wee hours of the morning, what could possibly cause this behavior? What
> sort of event could cause all data to be deleted from the table, and the
> sequence to be reset? Especially while there is an active connection?
> Thanks for any ideas, however wild or off the wall :-)
>

That is odd.  Even if it were an unlogged table, and there was a crash, the
sequence wouldn't reset, and even if it was running in a very long-running
transaction held open by a buggy connection pooler, the sequence would
still progress as it's immune to the effects of transactions.

So if all the data went missing, and the sequence reset, the only thing I
can think of is:

Someone ran:

TRUNCATE data RESTART IDENTITY;

or someone restored the table structure from a backup that deleted the
original table.

Do you log DDL?

Was the table partitioned?

You should also really be on 9.2.7, although I can't think of any bug
that's been fixed which could be responsible for this issue.

-- 
Thom

Reply via email to