On Thu, 2014-07-17 at 21:43 +0100, Simon Slavin wrote:
On 17 Jul 2014, at 8:43pm, Tristan Van Berkom
<[email protected]> wrote:
> The objective is to keep a revisioned history of 'E' whenever 'E'
has
> changed, or any of it's 'P' counterparts have changed, ideally
without
> storing a duplicate row for the one 'E' and every one of its 'P's
every
> time anything changes.
I assume that an event is created knowing what participants it has,
and that these participants never change over the lifetime of the
events.
Most of your design criteria depend on knowing these two things:
(A) Are you making few changes and doing lots of SELECTing, or making
lots and lots of changes and only occasionally needing to look things
up ?
(B) When you do SELECTing, do you mostly need to know about the data
as it is right now, or are most SELECTs about previous versions and
not the current version ?
Firstly, participants can be added and removed from the event, to solve
the issue of keeping track of which participants are valid for a given
revision, we have the 'P(hi)' (participant history index as I named it)
which keeps a row indicating the revision of each participant for each
master revision...
In this way, if a participant has been added or removed, then the
history for a given master revision will still indicate which
participants were included in *that* revision of the event.
As for the details (A) and (B), we know that SELECTing needs to be fast
for the 'current revision' but will seldom (or never) occur in the
history... for the history we are only concerned with listing the
previous record states (which does not particularly need to be
lightning fast).
Modifications will not happen frequently in relation to the amount
of events in the system (i.e. modifications will mostly occur on the
same day of the event, while we may store years and years worth of
events in the system), however modifications will be made on a per
participant basis (hence our desire to revision the participants
separately and avoid needless data duplication).
Once you know those two things you can decide whether most of the
work should be done when a change occurs, or whether you should make
noting a change as easy as possibly, and do all the hard work when
you need to look things up.
You will also know whether you need just a copy of each row with the
change-timestamp, or whether it's useful to keep another copy with
just 'data as it is now' in it for fast reference when you need
current data.
> SELECT * FROM 'E(h)'
Just a warning that you can't use single quotes as identifiers in
SQLite. Nor can you have the name of a table as a variable.
Yes, I have a habit of not conforming exactly to syntax while
illustrating an example on a mailing list... the specific syntax was not
the point of the example ;-)
Apart from that, you seem to have figured out all the usable
strategies, you just need to know how the data will be used to pick
between them.
Simon.
Thank you Simon and Pavlos for having responded to this post, this was a
design choice which is cheap to make now, but painfully expensive to
change later down the road, so thank you for sharing your insights.
Best Regards,
-Tristan
PS: Sorry for messing up the thread, I'm having some technical
difficulty with my mail client, causing me to reply from a separate
email client.
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users