On 09/19/2016 08:48 AM, Willy-Bas Loos wrote:
Since records can
be changed afterwards, it has been argued that we should have
"journaling", meaning that every change to the data is saved in a
separate schema that holds a "journaling" copy of each table


I don't think this is especially unusual. I've worked on similar projects that maintain history for regulatory reasons. I can't speak to your legal situation, but I don't think keeping history is a strange choice.

Re the security argument: it seems to me it depends on whether you can restrict access to the journal while providing access to the normal tables. I guess journaling would help for some threats but not others, right? If regular users can't touch the journal, then I agree keeping one could make sense.

Are you saying the journaling portion is 624GB on top of 1.1TB? Or that of the 1.1TB, 624GB of it is from journaling? Either way it doesn't seem like a massive cost to me.

I don't think PITR is an appropriate solution to keeping a 10-year history of changes.

It sounds like you're not looking for a green-field solution, but just trying to get perspective on what others are doing. Some resources for this that might help you:

https://www.youtube.com/watch?v=TRgni5q0YM8
https://github.com/arkhipov/temporal_tables (aka http://pgxn.org/dist/temporal_tables/)
http://www.cs.arizona.edu/~rts/tdbbook.pdf (also available in print)
https://www.amazon.com/Bitemporal-Data-Practice-Tom-Johnston/dp/0124080677

The two books contain multiple approaches to storing history, each with tradeoffs for ease-of-use, disk space, etc. Reading them might be overkill for you. If you want to choose one, I'd read Snodgrass. The first few chapters are tedious, but it gets better.

Temporal databases are in a funny in-between zone where there is lots of research, but standard tools are fairly underdeveloped. Postgres recently added range types and exclusion constraints, which are important primitives for building a temporal system, but it is still a long way from SQL:2011, and SQL:2011 is itself a long way from everything you might want. One thing that seems lacking to me, even in the research, is how to handle DDL changes. You should be glad that you only care about audit history and not subject history too, because going bi-temporal is where you really cross over into lack of available tools, outside of a few commercial offerings. (Teradata has temporal support, using a Snodgrass-like approach that pre-dates the standard.)

Paul



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to