"Isaac Raway" <[EMAIL PROTECTED]> wrote:
>
> At any rate, anyone have experience syncing SQLite DB files?
>
I have done this on two separate projects.
In the first case, the databases to be synced all had a fixed
set of records (a few hundred thousand rows). New rows were
never added or deleted. The only changes to the database were
value changes in individual columns of certain tables. To do
synchronization in this case, I timestamped every change. When
the two databases were brought together to synchronize, I just
chose the most recent change for each record. (Complications
arise when the same record changes in different ways in the
two databases during the same second. But we will gloss over
those details here.)
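A minimal sketch of that timestamp merge, assuming a hypothetical
table with "id", "value", and "mtime" (time of last change) columns -
the real schemas were of course different:

```python
import sqlite3

# Last-write-wins merge: pull into db_a any row of db_b that changed
# more recently.  The table name is interpolated, so it must be trusted.
def merge_newest(db_a_path, db_b_path, table):
    con = sqlite3.connect(db_a_path)
    con.execute("ATTACH DATABASE ? AS other", (db_b_path,))
    # For each record present in both files, keep whichever side
    # carries the newer change timestamp.
    con.execute(f"""
        UPDATE {table} SET
            value = (SELECT o.value FROM other.{table} AS o
                     WHERE o.id = {table}.id),
            mtime = (SELECT o.mtime FROM other.{table} AS o
                     WHERE o.id = {table}.id)
        WHERE EXISTS (
            SELECT 1 FROM other.{table} AS o
            WHERE o.id = {table}.id AND o.mtime > {table}.mtime
        )
    """)
    con.commit()
    con.close()
```

Run it in both directions and the two files end up agreeing on the
most recent value of every record.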
In the second case, values could be changed as in the first case,
but in addition, new records could be added. For that project,
I set the RowID for each newly created row equal to a 64-bit
random number from a good source of randomness. (The random()
function built into SQLite works for this - no coincidence.)
Then to synchronize, I just take the union of both databases,
and in cases where both databases contain the same record, I
take the most recently changed version.
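Sketched in code, again against a hypothetical "items" table whose
INTEGER PRIMARY KEY (an alias for the rowid) is assigned randomly at
insert time:

```python
import sqlite3

# Assign each new row a random 64-bit rowid.  SQLite's built-in
# random() yields a signed 64-bit value; abs() keeps the id positive.
def insert_item(con, value):
    con.execute("INSERT INTO items(id, value) VALUES (abs(random()), ?)",
                (value,))

# The sync step is a union: copy over any row the other database has
# that this one lacks.  Rows sharing an id are the "same record" and
# get resolved separately by most-recent-change.
def sync_union(db_a, db_b):
    con = sqlite3.connect(db_a)
    con.execute("ATTACH DATABASE ? AS other", (db_b,))
    con.execute("INSERT OR IGNORE INTO items SELECT * FROM other.items")
    con.commit()
    con.close()
```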
If the disconnected applications both insert a row and happen
to choose the same random RowID, you get a collision.
In my particular application, new rows were sufficiently
infrequent and the consequences of a collision were
sufficiently mild that using 64-bits of randomness was
sufficient for me not to have to worry about this. If you
are doing many, many inserts or if you really need to guarantee
that collisions will not occur, then you need to go to 128
or more bits of randomness for the record ids. Of course,
there is always the chance of a collision, but if you use
good randomness and a large enough unique ID, the probability
of a collision can be made vanishingly small - much less than
the probability of errors from other sources such as, for
example, a giant meteor striking earth and wiping out
civilization.
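"Vanishingly small" can be made concrete with the standard birthday
approximation: for n random k-bit ids, the collision probability is
roughly n*(n-1)/2^(k+1).

```python
# Birthday approximation for the chance that n random k-bit ids
# contain at least one duplicate (valid when the result is small).
def collision_probability(n, bits):
    return n * (n - 1) / 2 ** (bits + 1)

p64 = collision_probability(10**6, 64)    # a million rows: roughly 2.7e-8
p128 = collision_probability(10**6, 128)  # same rows, 128-bit ids: ~1.5e-27
```

So even a million inserts against 64-bit ids collide with odds worse
than one in ten million, and 128 bits pushes the odds past any
meteor-strike threshold.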
Some people are upset by the concept of computer software
that works with "high probability". They want to be able
to prove that a failure "never" occurs. To this idea, I have
three rebuttals:
(1) For all practical purposes, "one failure every
10 billion years" is the same as "never fails".
(2) Nothing on a computer is certain. When you add
1 and 1 on a computer you get 2 with high probability.
But if you believe in quantum physics (and the people
who designed your CPU likely do) then you will
recognize that there is a non-zero probability of
getting a different result. So having a non-zero
failure probability is not a new thing in computer
programming - it is just something that is not
acknowledged very often.
(3) The chance of failure because of random id collision
is kazillions of times smaller than the chance of failure
because you have a logic error somewhere else in your
code.
Returning to the topic at hand: Deletes are still a problem.
If when you delete a row you leave nothing behind, then when
syncing two databases A and B you do not know if the row was
added to A or deleted from B. To work around this, I do not
actually delete rows - I just mark them as "unused".
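The tombstone trick looks like this, assuming a hypothetical "deleted"
flag column alongside the change timestamp used for merging:

```python
import sqlite3

# "Delete" a row by marking it unused, so the tombstone (and its
# timestamp) survives to be merged like any other change.
def soft_delete(con, item_id):
    con.execute(
        "UPDATE items SET deleted = 1, mtime = strftime('%s','now') "
        "WHERE id = ?",
        (item_id,))

# Ordinary reads simply filter tombstones out:
#   SELECT * FROM items WHERE deleted = 0
```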
--
D. Richard Hipp <[EMAIL PROTECTED]>