Re: [sqlite] Synchronising multiple copies of a database

2009-10-08 Thread John Elrick
Jean-Denis Muys wrote:
> On 10/7/09 21:35 , "Adam DeVita"  wrote:
>
>   
>> One can  also get a mess if Mr. Red and Mr Black both get new customers, and
>> enter them and they both get the same ID because the auto-generated int
>> happens to be the same. Both copies get updated with the other guy's data,
>> they then get annoyed and enter the stuff again and it happens over again,
>> but now there are N entries of the other guy's customer  in the database
>> depending on how many times they do it.
>>
>> 
>
> I solve this problem the following way:
>
> New records ID is not set by the local client who creates the new record,
> but by the central database.

Another solution is to use a GUID instead of an integer.


John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Synchronising multiple copies of a database

2009-10-08 Thread Jean-Denis Muys
On 10/7/09 21:35 , "Adam DeVita"  wrote:

> One can  also get a mess if Mr. Red and Mr Black both get new customers, and
> enter them and they both get the same ID because the auto-generated int
> happens to be the same. Both copies get updated with the other guy's data,
> they then get annoyed and enter the stuff again and it happens over again,
> but now there are N entries of the other guy's customer  in the database
> depending on how many times they do it.
> 

I solve this problem the following way:

New records ID is not set by the local client who creates the new record,
but by the central database.

Upon new record creation a "trigger" is run that sends the new record to the
central database. The central database then allocates the new id (and
revision stamp) to the new record and returns them to the trigger which
inserts them back into to local repository.

This only works in an "online" scenario clearly.

An alternative is to guarantee uniqueness across clients by making sure the
newly allocated id is a composite that includes the creator client identity.

Jean-Denis

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Synchronising multiple copies of a database

2009-10-07 Thread John Elrick
Simon Slavin wrote:
> On 7 Oct 2009, at 8:33pm, John Elrick wrote:
>
>   
>> Isn't this a variation of the DVCS problem?   In other words, would it
>> be correct in saying that the underlying issue is treating this as a
>> database problem, rather than it being a versioning problem which
>> happens to involve a database?
>> 
>
> Yes yes yes.  And look how complicated the programming is to do DVCS  
> correctly.
>   

I would think the programming is relatively easy...the hard part is 
getting the rules right and being confident you've covered all the edge 
cases.

> And now the problem is ... What is the resolution of your  
> versioning ?  Do put a version number on each row, or do you consider  
> each column in each row to need its own version number ?  What if one  
> site changes one column and another changes another column of the same  
> row ?  How many version numbers do you want to store for just one  
> table ?
>   

As many as necessary, but no more


John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Synchronising multiple copies of a database

2009-10-07 Thread Simon Slavin

On 7 Oct 2009, at 8:33pm, John Elrick wrote:

> Isn't this a variation of the DVCS problem?   In other words, would it
> be correct in saying that the underlying issue is treating this as a
> database problem, rather than it being a versioning problem which
> happens to involve a database?

Yes yes yes.  And look how complicated the programming is to do DVCS  
correctly.

And now the problem is ... What is the resolution of your  
versioning ?  Do put a version number on each row, or do you consider  
each column in each row to need its own version number ?  What if one  
site changes one column and another changes another column of the same  
row ?  How many version numbers do you want to store for just one  
table ?

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Synchronising multiple copies of a database

2009-10-07 Thread Adam DeVita
"You have to be really careful"
Absolutely.  Even if  you know the order of updates (which I do). If site A
updates an off line record in a cached copy after site B deletes it other
sites can receive the change records in order and have the record re-appear
(via insert or replace).

One can  also get a mess if Mr. Red and Mr Black both get new customers, and
enter them and they both get the same ID because the auto-generated int
happens to be the same. Both copies get updated with the other guy's data,
they then get annoyed and enter the stuff again and it happens over again,
but now there are N entries of the other guy's customer  in the database
depending on how many times they do it.





On Wed, Oct 7, 2009 at 3:18 PM, Simon Slavin wrote:

>
> On 7 Oct 2009, at 7:20pm, Adam DeVita wrote:
>
> > regarding this
> > " The fault is that
> > almost nobody does it right: they neglect to keep an 'unaltered
> > central copy' and think they can cross-apply journals each time two
> > databases talk to one-another.  That does not work for various
> > reasons."
> >
> > Would a central repository of journals that can be applied to local
> > repositories be sufficient?  I suppose I assume that running the same
> > program on N workstations with the same set of journals should
> > produce N
> > identical results.
>
> You need a copy of the database which is not changed by any site.  All
> the sites send in their journals.  The journals are merged into a
> superjournal in time order.  The superjournal is then applied to the
> central copy of the database.  Then the updated database is sent back
> out to all sites.
>
> The problem comes when you apply multiple journals in a different
> order. Start with each site with identical copies of a TABLE with
> three clients: one managed by Mr. Green, one by Mr. Red, and one by
> Mr. Black.  'G R B'.  Then, in this order ...
>
> Mr. Green goes on holiday ...
> Site A says that all Mr. Green's customers will be handled by Mr. Red.
> UPDATE clients SET contact = 'red' WHERE contact = 'green'
>
> Mr. Red goes on holiday ...
> Site B says that all Mr. Red's customers will be handled by Mr. Black.
>
> Then Mr. Green comes back from holiday, and Mr. Black goes on holiday
> so ...
> Site C says that all Mr. Black's customers will be handled by Mr. Green.
>
> Then they all synchronise databases.  See if you can make them all end
> up with the same data if they synch against each-other rather than a
> central unaltered copy of the databases.  Doesn't work: one site might
> have 'B B B', another 'R B R'.  You can do it only by luck ... by
> happening to know in which order people went on holiday.  However, if
> you always synch against a central unaltered copy of the database you
> can synch in any order.  Once everyone has synchronised you distribute
> a copy of the central database to everyone and they all have identical
> data once more.  That's the simplest setup.  You can get more
> complicated by having each site remember which journals they've played
> back.
>
> The problem does not occur if any record can only ever be modified by
> one site.  But if you have the normal 'anyone can do anything' setup,
> you have to be really really careful.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
VerifEye Technologies Inc.
905-948-0015x245
7100 Warden Ave, Unit 3
Markham ON, L3R 8B5
Canada
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Synchronising multiple copies of a database

2009-10-07 Thread John Elrick
Simon Slavin wrote:
> On 7 Oct 2009, at 7:20pm, Adam DeVita wrote:
>
>   
>> regarding this
>> " The fault is that
>> almost nobody does it right: they neglect to keep an 'unaltered
>> central copy' and think they can cross-apply journals each time two
>> databases talk to one-another.  That does not work for various  
>> reasons."
>>
>> Would a central repository of journals that can be applied to local
>> repositories be sufficient?  I suppose I assume that running the same
>> program on N workstations with the same set of journals should  
>> produce N
>> identical results.
>> 
>
>   
SNIP
> The problem comes when you apply multiple journals in a different  
> order. Start with each site with identical copies of a TABLE with  
> three clients: one managed by Mr. Green, one by Mr. Red, and one by  
> Mr. Black.  'G R B'.  Then, in this order ...
>
> Mr. Green goes on holiday ...
> Site A says that all Mr. Green's customers will be handled by Mr. Red.
> UPDATE clients SET contact = 'red' WHERE contact = 'green'
>
> Mr. Red goes on holiday ...
> Site B says that all Mr. Red's customers will be handled by Mr. Black.
>
> Then Mr. Green comes back from holiday, and Mr. Black goes on holiday  
> so ...
> Site C says that all Mr. Black's customers will be handled by Mr. Green.
>
> Then they all synchronise databases.  See if you can make them all end  
> up with the same data if they synch against each-other rather than a  
> central unaltered copy of the databases.  Doesn't work: one site might  
> have 'B B B', another 'R B R'.  You can do it only by luck ... by  
> happening to know in which order people went on holiday.  However, if  
> you always synch against a central unaltered copy of the database you  
> can synch in any order.  Once everyone has synchronised you distribute  
> a copy of the central database to everyone and they all have identical  
> data once more.  That's the simplest setup.  You can get more  
> complicated by having each site remember which journals they've played  
> back.
>   

Simon,

Isn't this a variation of the DVCS problem?   In other words, would it 
be correct in saying that the underlying issue is treating this as a 
database problem, rather than it being a versioning problem which 
happens to involve a database?

I ask because there are two separate projects which involve this sort of 
issue which I have simply deferred for the time being.


John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Synchronising multiple copies of a database

2009-10-07 Thread Simon Slavin

On 7 Oct 2009, at 7:20pm, Adam DeVita wrote:

> regarding this
> " The fault is that
> almost nobody does it right: they neglect to keep an 'unaltered
> central copy' and think they can cross-apply journals each time two
> databases talk to one-another.  That does not work for various  
> reasons."
>
> Would a central repository of journals that can be applied to local
> repositories be sufficient?  I suppose I assume that running the same
> program on N workstations with the same set of journals should  
> produce N
> identical results.

You need a copy of the database which is not changed by any site.  All  
the sites send in their journals.  The journals are merged into a  
superjournal in time order.  The superjournal is then applied to the  
central copy of the database.  Then the updated database is sent back  
out to all sites.

The problem comes when you apply multiple journals in a different  
order. Start with each site with identical copies of a TABLE with  
three clients: one managed by Mr. Green, one by Mr. Red, and one by  
Mr. Black.  'G R B'.  Then, in this order ...

Mr. Green goes on holiday ...
Site A says that all Mr. Green's customers will be handled by Mr. Red.
UPDATE clients SET contact = 'red' WHERE contact = 'green'

Mr. Red goes on holiday ...
Site B says that all Mr. Red's customers will be handled by Mr. Black.

Then Mr. Green comes back from holiday, and Mr. Black goes on holiday  
so ...
Site C says that all Mr. Black's customers will be handled by Mr. Green.

Then they all synchronise databases.  See if you can make them all end  
up with the same data if they synch against each-other rather than a  
central unaltered copy of the databases.  Doesn't work: one site might  
have 'B B B', another 'R B R'.  You can do it only by luck ... by  
happening to know in which order people went on holiday.  However, if  
you always synch against a central unaltered copy of the database you  
can synch in any order.  Once everyone has synchronised you distribute  
a copy of the central database to everyone and they all have identical  
data once more.  That's the simplest setup.  You can get more  
complicated by having each site remember which journals they've played  
back.

The problem does not occur if any record can only ever be modified by  
one site.  But if you have the normal 'anyone can do anything' setup,  
you have to be really really careful.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Synchronising multiple copies of a database

2009-10-07 Thread Adam DeVita
regarding this
" The fault is that
almost nobody does it right: they neglect to keep an 'unaltered
central copy' and think they can cross-apply journals each time two
databases talk to one-another.  That does not work for various reasons."

Would a central repository of journals that can be applied to local
repositories be sufficient?  I suppose I assume that running the same
program on N workstations with the same set of journals should produce N
identical results.



On Wed, Oct 7, 2009 at 12:16 PM, Simon Slavin
wrote:

>
> On 7 Oct 2009, at 1:47pm, Jean-Denis Muys wrote:
>
> > On 10/7/09 11:50 , "Simon Slavin"  wrote:
> >
> >> Try really really hard just to have all sites access your MySQL
> >> database remotely.
> >
> > Unfortunately this approach is not possible in the short term. The
> > client
> > applications are legacy applications, porting them to that scheme is
> > a major
> > undertaking. [snip]
>
> I completely understand.  The recommendation is valuable in the
> general case, but useless in yours.  Still, that's why they pay you
> the big bucks: to write the complicated program.
>
> >> Keep a journal.  Keep an unaltered central copy of the data.  As each
> >> site contacts the central site, play that sites journal back against
> >> the unaltered central copy.  The post-journal central copy of the
> >> database becomes the new copy for distribution.
> >
> > Interesting idea, that makes a lot of sense in the "offline" scenario.
>
> Standard solution to the synchronisation problem.  The fault is that
> almost nobody does it right: they neglect to keep an 'unaltered
> central copy' and think they can cross-apply journals each time two
> databases talk to one-another.  That does not work for various reasons.
>
> The synchronisation service built into Mac OS X (e.g. synchronising
> with online services or an iPhone/iPod) implements it in the correct
> manner.  It takes extra data space and fussy programming but it does
> at least work right !
>
> > [snip] In any case, any book reference on this topic?
>
> Since I joined this list and noticed repeated questions on the subject
> I have been trying hard to find any book with anything significant to
> say on the issue.  I failed: everything I found was lacking in some
> way.  Some were flat-out wrong.  I work at a university and I think
> I'm going to ask the Computing people to find me someone who knows
> this stuff.  I'm just paid to do it in real life, not read or write
> books about it.  If I find something good I'll read it and post here
> about it.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
VerifEye Technologies Inc.
905-948-0015x245
7100 Warden Ave, Unit 3
Markham ON, L3R 8B5
Canada
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Synchronising multiple copies of a database

2009-10-07 Thread Simon Slavin

On 7 Oct 2009, at 1:47pm, Jean-Denis Muys wrote:

> On 10/7/09 11:50 , "Simon Slavin"  wrote:
>
>> Try really really hard just to have all sites access your MySQL
>> database remotely.
>
> Unfortunately this approach is not possible in the short term. The  
> client
> applications are legacy applications, porting them to that scheme is  
> a major
> undertaking. [snip]

I completely understand.  The recommendation is valuable in the  
general case, but useless in yours.  Still, that's why they pay you  
the big bucks: to write the complicated program.

>> Keep a journal.  Keep an unaltered central copy of the data.  As each
>> site contacts the central site, play that sites journal back against
>> the unaltered central copy.  The post-journal central copy of the
>> database becomes the new copy for distribution.
>
> Interesting idea, that makes a lot of sense in the "offline" scenario.

Standard solution to the synchronisation problem.  The fault is that  
almost nobody does it right: they neglect to keep an 'unaltered  
central copy' and think they can cross-apply journals each time two  
databases talk to one-another.  That does not work for various reasons.

The synchronisation service built into Mac OS X (e.g. synchronising  
with online services or an iPhone/iPod) implements it in the correct  
manner.  It takes extra data space and fussy programming but it does  
at least work right !

> [snip] In any case, any book reference on this topic?

Since I joined this list and noticed repeated questions on the subject  
I have been trying hard to find any book with anything significant to  
say on the issue.  I failed: everything I found was lacking in some  
way.  Some were flat-out wrong.  I work at a university and I think  
I'm going to ask the Computing people to find me someone who knows  
this stuff.  I'm just paid to do it in real life, not read or write  
books about it.  If I find something good I'll read it and post here  
about it.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Synchronising multiple copies of a database

2009-10-07 Thread Jean-Denis Muys
On 10/7/09 11:50 , "Simon Slavin"  wrote:

> 
> On 7 Oct 2009, at 10:13am, Jean-Denis Muys wrote:
> 
>> [setup description]
>> 
>> I'd appreciate some feedback here or
>> pointers to litterature.
 
> Try really really hard just to have all sites access your MySQL
> database remotely.

Unfortunately this approach is not possible in the short term. The client
applications are legacy applications, porting them to that scheme is a major
undertaking. I need to insert my code within the legacy applications only in
a way that as little invasive as possible. This trigger/polling approach was
deemed a fair compromise. Overtime, new appls will probably be developped
that work directly against the central database, and the legacy apps will be
phased out progressively. Also those legacy apps also work when offline,
which is a strong incentive to keep them (and I know this opens a new kind
of can of worms). Finaly the central server is one or two stellar systems
away, with bad latency and throughput. The local data repository in a way
acts as a cache to keep the apps responsive.

> 
> Keep a journal.  Keep an unaltered central copy of the data.  As each
> site contacts the central site, play that sites journal back against
> the unaltered central copy.  The post-journal central copy of the
> database becomes the new copy for distribution.

Interesting idea, that makes a lot of sense in the "offline" scenario.

> [snip scary scenarios]
> 
> If your system deals with those, it's most of the way there.
> 

I'll have to handle those cases very carefuly. Clearly, when two clients
modify the global state in an incompatible way, some kind of conflict
resolution must happen (similar to what Version Control Systems do). The
proviso here is to make sure these conflicts are at least detected.

I need to think about all this some more. Thanks a lot for your very
valuable feedback.

In any case, any book reference on this topic?

Jean-Denis

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Synchronising multiple copies of a database

2009-10-07 Thread Simon Slavin

On 7 Oct 2009, at 10:13am, Jean-Denis Muys wrote:

> [setup description]
>
> I'd appreciate some feedback here or
> pointers to litterature.

You see how complicated you had to get before you had an acceptable  
solution ?  Two extra columns ?  Timestamps ?  Retention of records  
which have to be ignored ?  I see nothing obvious wrong, so here are  
some general comments.

Try really really hard just to have all sites access your MySQL  
database remotely.  MySQL is an excellent client/server version of  
SQL.  You could kill all the complexity you had to invent by just  
having everyone access your database live.  Not only is the  
programming simpler, but you can back everything up in one go, and  
everyone has completely up-to-date data.  If this solution cannot be  
implemented for you, try

Keep a journal.  Keep an unaltered central copy of the data.  As each  
site contacts the central site, play that sites journal back against  
the unaltered central copy.  The post-journal central copy of the  
database becomes the new copy for distribution.

If these just cannot be done, you have to implement your own  
solution.  Check that it handles these nasty scenarios for  
synchronisation:

One site deletes a record then another site modifies it.  e.g. Branch  
1 sez "Our supplier withdrew the product 'Flinns' and created a new  
one."; Branch 2 sez "Our supplier renamed their 'Flinns' product to be  
'Flanns' because 'Flinns' is a swearword in Hijinks.  Each site then  
continues selling the new product.  Headquarters considers these the  
same product, and wants sales figures added up, not considered separate.

Superseding UPDATEs.  Starting with three cars: black, red and green,  
one site resprays all black cars red and another site resprays all red  
cars green.  After synchronising, depending on which update you get  
first you may or may not end up with a red car.

Competing UPDATE and INSERT.  One site creates new product for  
category L.  Then Headquarters sends out a memo that category L is now  
category Q but it doesn't know about the new product yet.  The site's  
copy of the database gets accurate data, but since Headquarters got  
the UPDATE first, then the INSERT, it somehow still has a category L  
product.

If your system deals with those, it's most of the way there.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users