On Fri, Jul 22, 2011 at 4:22 PM, Simon Slavin <[email protected]> wrote:
>
> On 22 Jul 2011, at 10:06pm, Kent Tenney wrote:
>
>> I need to monitor changes in an an sqlite3 file an application uses.
>>
>> I make copies periodically, so I have 2 files
>> shotwell_2011-07-21.db and shotwell_2011-07-22.db
>>
>> For example, I'm interested in PhotoTable, which had rows added,
>> deleted and changed between those days.
>>
>> Figuring out additions seems straightforward, there is an 'id' column.
>>
>> I would like advice on ways to determine changes and deletions.
>> I'm currently accessing the files via Python's sqlite3 library, though
>> plan to start using SqlAlchemy.
>>
>> I expect I'm not the first to have interest in this, but have been
>> unable to locate any discussion of it.
>
> You're effectively asking how to sych two copies of the same database.  
> That's something I used to write about a lot.  It's impossible unless you add 
> more data.

:-[

I can't add data, the db belongs to another app, I just watch.

>
> You first do a LEFT JOIN and a RIGHT JOIN to find rows which were added and 
> rows which were deleted.  No real problem, just two SELECTs on the 'rowid' 
> column every TABLE must have.  But how do you find rows which were updated ?  
> You have to write a SELECT that compares every field, which means either some 
> custom logic for each TABLE, or a routine that finds out all the column names 
> returned when you SELECT * then turns that into a SELECT that compares each 
> column of each row.  Messy.

I'm expecting a bit of mess.
So far, I'm considering 2 approaches

- dump to sql and diff the files. Initial tests seem to show some
unexpected results
- generate a "rowhash" for each record, hash a concat of all values in a row.

I don't care about any of the particulars you describe, only:

updated = []
for row in rows:
 if row[newer] != row[older]:
   updated.append(row)



  And even then, do you care if a row was changed twice ?  If the same
field was changed twice do you need to know the 'middle' value ?  If
two fields were changed do you need to know which order the changes
were made ?  If a row was changed then deleted do you need to know the
change ?
>
> The most detailed approach is journaling.  Write your own routine that you 
> call whenever you do DELETE or UPDATE.  It does what you want but also writes 
> a timestamped entry to a 'log' TABLE.  To list all the changes, just read 
> whatever you noted in your 'log' TABLE.  This gives you as much data as you 
> care to log.  It might be simplest to log the SQL command itself.
>
> A less detailed approach is to add a 'lastChange' column to every table.  
> Whenever you INSERT or UPDATE, make sure you update this column too.  You use 
> a SELECT ... JOINs to find deletions, but you can SELECT on the lastChange 
> column to find insertions and updates.
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to