Alban,

It's a simple financial transaction processing application. The application permits editing / updating / deleting of entered data, even multiple times, but an audit trail of the data, tracing through all versions back to the original, must be preserved. (As outlined, programmatically I could approach it by keeping a parallel set of tables and copying the row being replaced into the parallel table set, or by keeping all record versions in a single table with a flag to indicate the final / current version.) I am asking whether there are better ways to do it.
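To make the second option concrete, here is a minimal sketch of the parallel-table variant, with a trigger copying the replaced row (the table and column names are only illustrative, not from the actual application):

CREATE TABLE txn (
    id         bigint PRIMARY KEY,
    amount     numeric(12,2) NOT NULL,
    updated_at timestamptz   NOT NULL DEFAULT now()
);

-- Parallel history table: same columns, no PK, so it can hold every
-- superseded version of a row.
CREATE TABLE txn_history (LIKE txn);
ALTER TABLE txn_history ADD COLUMN archived_at timestamptz NOT NULL DEFAULT now();

-- Copy the old version of a row into the history table before it is
-- overwritten or deleted.
CREATE FUNCTION txn_archive() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO txn_history SELECT OLD.*, now();
    IF TG_OP = 'DELETE' THEN
        RETURN OLD;
    END IF;
    RETURN NEW;
END;
$$;

CREATE TRIGGER txn_audit
    BEFORE UPDATE OR DELETE ON txn
    FOR EACH ROW EXECUTE FUNCTION txn_archive();

With this in place an ordinary UPDATE or DELETE on txn leaves the full chain of prior versions in txn_history, ordered by archived_at.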
with warm regards
Sanjay Minni
+91-9900-902902

On Tue, 16 Nov 2021 at 15:57, Alban Hertroys <haram...@gmail.com> wrote:
>
> > On 16 Nov 2021, at 10:20, Laurenz Albe <laurenz.a...@cybertec.at> wrote:
> >
> > On Tue, 2021-11-16 at 13:56 +0530, Sanjay Minni wrote:
> >> I need to keep a copy of old data as the rows are changed.
> >>
> >> For a general RDBMS I could think of keeping all the data in the same table with a flag
> >> to indicate older copies of updated / deleted rows or keep a parallel table and copy
> >> these rows into the parallel data under program / trigger control. Each has its pros and cons.
> >>
> >> In Postgres would i have to follow the same methods or are there any features / packages available ?
> >
> > Yes, I would use one of these methods.
> >
> > The only feature I can think of that may help is partitioning: if you have one partition
> > for the current data and one for the deleted data, then updating the flag would
> > automatically move the row between partitions, so you don't need a trigger.
>
> Are you building (something like) a data-vault? If so, keep in mind that you will have a row for every update, not just a single deleted row. Enriching the data can be really useful in such cases.
>
> For a data-vault at a previous employer, we determined how to treat new rows by comparing a (md5) hash of the new and old rows, adding the hash and a validity interval to the stored rows. Historic data went to a separate table for each respective current table.
>
> The current tables “inherited” the PK’s from the tables on the source systems (this was a data-warehouse DB). Obviously that same PK can not be applied to the historic tables where there _will_ be duplicates, although they should be at non-overlapping validity intervals.
>
> Alternatively, since this is time-series data, it would probably be a good idea to store that in a way optimised for that. TimescaleDB comes to mind, or arrays as per Pavel’s suggestion at
> https://stackoverflow.com/questions/68440130/time-series-data-on-postgresql .
>
> Regards,
>
> Alban Hertroys
> --
> If you can't see the forest for the trees,
> cut the trees and you'll find there is no forest.
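As a concrete reading of Laurenz's partitioning suggestion above, a rough sketch (names are illustrative; it relies on PostgreSQL 11+ moving a row to the other partition when its partition-key flag is updated, so no trigger is needed):

-- One partition for current rows, one for superseded rows; the flag is the
-- partition key, so flipping it moves the row between partitions.
CREATE TABLE ledger (
    id         bigint        NOT NULL,
    version    integer       NOT NULL,
    amount     numeric(12,2) NOT NULL,
    valid_from timestamptz   NOT NULL DEFAULT now(),
    valid_to   timestamptz,
    is_current boolean       NOT NULL DEFAULT true,
    PRIMARY KEY (id, version, is_current)   -- partition key must be part of the PK
) PARTITION BY LIST (is_current);

CREATE TABLE ledger_current PARTITION OF ledger FOR VALUES IN (true);
CREATE TABLE ledger_history PARTITION OF ledger FOR VALUES IN (false);

-- "Editing" transaction 42: retire the current version, then add the new one.
BEGIN;

UPDATE ledger
   SET is_current = false, valid_to = now()
 WHERE id = 42 AND is_current;            -- row moves into ledger_history

INSERT INTO ledger (id, version, amount, is_current)
SELECT id, max(version) + 1, 99.95, true
  FROM ledger
 WHERE id = 42
 GROUP BY id;

COMMIT;

The valid_from / valid_to columns keep the kind of validity interval Alban mentions, so the full version chain for an id can be read back in order from the parent table.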