On Thu, 2014-07-17 at 21:43 +0100, Simon Slavin wrote:
On 17 Jul 2014, at 8:43pm, Tristan Van Berkom <[email protected]> wrote:

> The objective is to keep a revisioned history of 'E' whenever 'E' has > changed, or any of it's 'P' counterparts have changed, ideally without > storing a duplicate row for the one 'E' and every one of its 'P's every
> time anything changes.

I assume that an event is created knowing what participants it has, and that these participants never change over the lifetime of the events.

Most of your design criteria depend on knowing these two things:

(A) Are you making few changes and doing lots of SELECTing, or making lots and lots of changes and only occasionally needing to look things up ?

(B) When you do SELECTing, do you mostly need to know about the data as it is right now, or are most SELECTs about previous versions and not the current version ?



Firstly, participants can be added and removed from the event, to solve
the issue of keeping track of which participants are valid for a given
revision, we have the 'P(hi)' (participant history index as I named it)
which keeps a row indicating the revision of each participant for each
master revision...

In this way, if a participant has been added or removed, then the
history for a given master revision will still indicate which
participants were included in *that* revision of the event.

As for the details (A) and (B), we know that SELECTing needs to be fast
for the 'current revision' but will seldom (or never) occur in the
history... for the history we are only concerned with listing the
previous record states (which does not particularly need to be
lightning fast).

Modifications will not happen frequently in relation to the amount
of events in the system (i.e. modifications will mostly occur on the
same day of the event, while we may store years and years worth of
events in the system), however modifications will be made on a per
participant basis (hence our desire to revision the participants
separately and avoid needless data duplication).


Once you know those two things you can decide whether most of the work should be done when a change occurs, or whether you should make noting a change as easy as possibly, and do all the hard work when you need to look things up.

You will also know whether you need just a copy of each row with the change-timestamp, or whether it's useful to keep another copy with just 'data as it is now' in it for fast reference when you need current data.

> SELECT * FROM 'E(h)'

Just a warning that you can't use single quotes as identifiers in SQLite. Nor can you have the name of a table as a variable.


Yes, I have a habit of not conforming exactly to syntax while
illustrating an example on a mailing list... the specific syntax was not
the point of the example ;-)


Apart from that, you seem to have figured out all the usable strategies, you just need to know how the data will be used to pick between them.

Simon.


Thank you Simon and Pavlos for having responded to this post, this was a
design choice which is cheap to make now, but painfully expensive to
change later down the road, so thank you for sharing your insights.

Best Regards,
 -Tristan

PS: Sorry for messing up the thread, I'm having some technical difficulty with my mail client, causing me to reply from a separate email client.





_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to