Hi Peter, and thanks for answering.
> > I'm unclear as to the "philosophical" distinctions here and would be
> > grateful if anybody could explain the diffence(s) between them?
> A time series database contains data records containing a time stamp
> A temporal database contains records that have a timestamp range
OK - so we have two stock "ticker" dbs - one a Time Series db (TS-db) and one a
Temporal db (T-db).
All of the code below is on a fiddle here
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=207f38e5c7d6e7861c402c2c4926840c
CREATE TABLE ts_portfolio -- time series
(
txn_id INTEGER NOT NULL,
company TEXT NOT NULL,
price INTEGER NOT NULL,
txn_ts TIMESTAMP NOT NULL
);
with a few records
INSERT INTO ts_portfolio VALUES
(1234, 'DB', 10, '07/11/21 00:12:00'), -- record inserted on
07/11/21
(2345, 'DB', 20, '08/11/21 00:12:00'), -- record inserted on
08/11/21
(3456, 'DB', 30, '09/11/21 00:12:00'), -- record inserted on
09/11/21
(4567, 'DB', 40, '10/11/21 00:12:00'), -- record inserted on
10/11/21
(5678, 'DB', 50, '11/11/21 00:12:00'); -- record inserted on
11/11/21
So, for example, in order to query the share price AS OF 00:00:00 on the 09/11
SELECT company, price
FROM ts_portfolio
WHERE company = 'DB'
AND txn_ts <= '09/11/21 00:00:00'
ORDER BY txn_ts DESC
LIMIT 1;
We pick up the price from the latest transaction either before or exactly on
09/11/2021 00:00:00
Things become tricky when we want to see the prices on a given day: 09/11/2021
--
-- UNION better here? See EXPLAIN in fiddle - not important for this debate!
--
WITH cte1 AS
(
SELECT txn_id AS t_start
FROM ts_portfolio
WHERE company = 'DB'
AND txn_ts <= '09/11/21 00:00:00'
ORDER BY txn_ts DESC
LIMIT 1
), cte2 AS
(
SELECT txn_id AS t_end
FROM ts_portfolio
WHERE company = 'DB'
AND txn_ts <= '09/11/21 23:59:59.999999'
ORDER BY txn_ts DESC
LIMIT 1
)
SELECT
t.txn_id, t.company, t.price
FROM
ts_portfolio t
WHERE txn_id BETWEEN (SELECT t_start FROM cte1) AND (SELECT t_end FROM cte2);
txn_id company price
2345 DB 20
3456 DB 30
which is correct - from 08/11 midday till 09/11 midday, the price was 20 and
then at midday 09/11 it rose to 30!
Becoming complex.
However, a Temporal (versioned) table would only require one extra field:
txn_t_from TIMESTAMP NOT NULL,
txn_t_to TIMESTAMP NOT NULL -- extra field - see fiddle for data inserts
and updates!
Each INSERT (behind the scenes) is an INSERT and an UPDATE.
INSERT INTO tdb_portfolio VALUES
(1000, 'DB', 5, '01/01/1900', '07/11/21 00:12:00'), -- inserted in the
past, updated 07/11 @ 12:00
(1234, 'DB', 10, '07/11/21 00:12:00', '08/11/21 00:12:00'), -- record inserted
on 07/11/21 - updated 08/11 @ 12:00
(2345, 'DB', 20, '08/11/21 00:12:00', '09/11/21 00:12:00'), -- record inserted
on 08/11/21 - and so on...
(3456, 'DB', 30, '09/11/21 00:12:00', '10/11/21 00:12:00'), -- record inserted
on 09/11/21
(4567, 'DB', 40, '10/11/21 00:12:00', '11/11/21 00:12:00'), -- record inserted
on 10/11/21
(5678, 'DB', 50, '11/11/21 00:12:00', '19/01/2038'); -- record inserted on
11/11/21 - never updated (effectively INFINITY until next update)
The queries become simpler:
SELECT txn_id, company, price
FROM tdb_portfolio AS OF '09/11/2021 00:00:00';
and the complex one above collapses to:
SELECT txn_id, company, price FROM tdb_portfolio
FOR SYSTEM_TIME BETWEEN ('09/11/2021 00:00:00' AND '09/11/2021
23:59:59.999999');
or maybe if there was a DATE(SYSTEM_TIME) function, ... FOR DATE(SYSTEM_TIME) =
'09/11/2021';
I'm assuming that (when properly introduced into PostgreSQL), there will be
optimisations for these sorts of query.
Do temporal tables bring anything else "to the party" - the augmented
functionality is a nice-to-have, but hardly earth-shattering?
Have I missed out on anything important?
Thx, SQLP!