Thanks for the ideas everyone.  I do like Igor's suggestion of reusing
the same tables and having a "live" flag, as I'd hate to have an
entire extra set of duplicate tables.  However, since this approach
duplicates data, there are still some issues that I'm not sure how to
best deal with.

Let me explain the use case in more detail.  The users of the system
are referees of soccer games.  They need to be able to view their
schedule of matches.  A match includes a game number, date, time, home
team, away team, location, a center referee, and two linesmen.  The
users do not modify any of the main data elements associated with a
match.  But they can accept or reject matches, and that acceptance
information is associated with a match and the referee.
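For concreteness, here is a minimal sketch of how the match and acceptance data described above might be modeled with Igor's single-table-plus-"live"-flag approach. All table and column names are my own assumptions, and I'm using SQLite purely for illustration:

```python
import sqlite3

# In-memory database for illustration only; a real system would use a server DB.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE match (
    id           INTEGER PRIMARY KEY,
    game_number  INTEGER NOT NULL,
    match_date   TEXT,
    match_time   TEXT,
    home_team    TEXT,
    away_team    TEXT,
    location     TEXT,
    center_ref   TEXT,
    linesman1    TEXT,
    linesman2    TEXT,
    is_live      INTEGER NOT NULL DEFAULT 0   -- 1 = published, 0 = edit copy
);

-- Acceptance is per (match, referee).  Keying it by the stable game
-- number rather than the row id means it automatically applies to
-- whichever copy of the match is currently live.
CREATE TABLE acceptance (
    game_number  INTEGER NOT NULL,
    referee      TEXT NOT NULL,
    status       TEXT NOT NULL CHECK (status IN ('accepted', 'rejected')),
    PRIMARY KEY (game_number, referee)
);
""")
```

One design note: if the acceptance rows reference the match by its game number (or any other identifier shared by the live and edit copies) instead of the physical row id, they never need to be copied when the live row is swapped out.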

Administrators will be scheduling all of the matches, assigning
referees to matches, etc.  There could be several administrators.
There are many changes and updates that happen on a regular basis to
already published schedules.  Sometimes entire fields are closed due
to bad conditions, and all of the matches have to be moved elsewhere.
The updating of data can take hours or days, depending on whether the
admin is waiting on information.

Updated data should not be released one match at a time, but all
changes should go live at once.  There are many business reasons for
this, and I won't try to explain them all.  They mainly
involve publishing incomplete data, schedule reviews, and approvals
prior to publishing.  Creating and modifying these schedules can take
a lot of trial and error, moving matches around and trying to get
everything to work well for all parties involved.  It wouldn't be good
if referees looked at their schedules when a match was in a state of
flux and then didn't check the schedule again. Lastly, when a batch of
changes is completed, referees could be notified of the changes all at
once, not one match change at a time.

So, given this scenario, a live record Match #100 could have
CenterReferee accept the game, Linesman1 reject it, and Linesman2 not
yet respond.  Meanwhile, if there is a non-live duplicate of Match
#100, then that acceptance information would need to be associated
with it as well.  But the acceptance information could change after
the non-live record is created.

I suppose when a non-live match that already has a live version is
saved (thus making it the new live record), the information attached
to the existing live record could be copied to the non-live record,
the old live record deleted, and the non-live record made live.  But
to me, this
just seems hacky and prone to fail.  Kind of like data synchronization
systems always seem to get screwed up at times.  Maybe it is the
only/best/simplest solution though. I also need to think about how to
purge old edit data that isn't saved.
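The swap described above feels less hacky if it happens atomically inside one database transaction, since readers can never observe a half-published match. A sketch, assuming the same single-table scheme (an `is_live` flag plus a shared `game_number` linking the live and edit copies, and acceptance data kept in a separate table keyed by game number so nothing needs copying):

```python
import sqlite3

def publish(conn, game_number):
    """Atomically replace a match's live row with its edit row.

    Assumes at most one live row and one edit row per game_number.
    Both statements run in a single transaction, so a failure part
    way through rolls everything back.
    """
    with conn:  # sqlite3 connection: commit on success, rollback on error
        conn.execute(
            "DELETE FROM match WHERE game_number = ? AND is_live = 1",
            (game_number,))
        conn.execute(
            "UPDATE match SET is_live = 1 WHERE game_number = ?",
            (game_number,))

# Minimal demonstration with an assumed, pared-down schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE match (game_number INTEGER, location TEXT, is_live INTEGER)")
conn.execute("INSERT INTO match VALUES (100, 'Old Field', 1)")  # live version
conn.execute("INSERT INTO match VALUES (100, 'New Field', 0)")  # edit version
publish(conn, 100)
```

After `publish(conn, 100)`, only the edited row remains and it is marked live. The copy-acceptance-then-delete step disappears entirely if acceptance rows are keyed by game number rather than by the physical match row.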

I think that I could make do with only live and edit versions of the
data (just two rows of the same data at a time).  I don't think I
need to keep track of multiple versions.  It would be nice to be able
to support multiple "batch edit sessions" that are each distinct from
the others.  But that would get *really* complex...

Although, it would be nice if the admin could save just a certain set
of matches (say all 1st Division games, but leave all 2nd Division
games in edit mode).  That way, the admin could work on everything,
and publish sets of changes as he completes them instead of waiting
until all changes are done.  This should still be doable with just a
live and edit version of the data records.
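Publishing a subset like that is just the same atomic swap restricted by a filter. A sketch, assuming (my assumption, not in the original schema) that the match table carries a `division` column:

```python
import sqlite3

def publish_division(conn, division):
    """Publish all edit rows for one division in a single transaction,
    leaving other divisions' pending edits untouched.
    """
    with conn:  # commit on success, rollback on any error
        # Drop live rows that have a pending edit replacement...
        conn.execute(
            "DELETE FROM match WHERE division = ? AND is_live = 1 "
            "AND game_number IN (SELECT game_number FROM match "
            "WHERE division = ? AND is_live = 0)",
            (division, division))
        # ...then promote that division's edit rows to live.
        conn.execute(
            "UPDATE match SET is_live = 1 "
            "WHERE division = ? AND is_live = 0",
            (division,))

# Demonstration: one match per division, each with a live and an edit row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE match (game_number INTEGER, division INTEGER, is_live INTEGER)")
conn.executemany("INSERT INTO match VALUES (?, ?, ?)",
                 [(1, 1, 1), (1, 1, 0),    # 1st Division: live + edit
                  (2, 2, 1), (2, 2, 0)])   # 2nd Division: live + edit
publish_division(conn, 1)  # publish only the 1st Division changes
```

The 2nd Division match keeps both its live row and its in-progress edit row, so the admin can keep working on it.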

After writing this, I have a much better sense of how to do it.  But
I'd still like to hear any other ideas.  Are there some patterns or
solutions that I should look into that could help with this?  I don't
think long transactions are the way to go for this.  Database
partitioning (which doesn't really solve this problem but is a great
idea for large datasets) isn't needed yet, as the amount of data will
be manageable.

Thanks again!
Tauren


On 10/30/07, Eelco Hillenius <[EMAIL PROTECTED]> wrote:
> > using long transactions would be pretty horrible :(
> > long transactions should be avoided like the plague at all cost.
> > Especially with certain databases (like Microsofts or Sybase variants)
>
> I never really used them myself, so that's entirely possible :-)
>
> > i would do it how Igor proposes. Have multiple rows for the same data
> > with 1 row that is live and another row (with maybe the editing userid as
> > data so that you know what to get the next time)
>
> Yeah, that's a workable idea. The problem then is cleaning up the
> entries that never were finished, and just the fact that you're
> polluting your data model with administrative data. But it might be
> the best solution.
>
> Eelco
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
