Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-04 Thread Jay A. Kreibich
On Wed, Apr 04, 2012 at 11:15:10AM +1000, Webdude scratched on the wall: But the same SQLite version, using the same schema, set up with the same PRAGMAs, creating a db with the same data and in the same order, and despite hardware / HDD / OS, should still produce the same file byte-for-byte

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-04 Thread Jay A. Kreibich
On Wed, Apr 04, 2012 at 11:27:41AM +1000, Webdude scratched on the wall: But if data was added exactly in the same way/order shouldn't the counters all count to the same end result if the process was repeated at a later time on another machine? In theory, yes, but that's a very thin line

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-04 Thread Black, Michael (IS)
On Wed, Apr 04, 2012 at 11:15:10AM +1000, Webdude scratched on the wall: But the same SQLite version, using the same schema, set up with the same PRAGMAs

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-04 Thread Simon Slavin
On 4 Apr 2012, at 1:30pm, Black, Michael (IS) michael.bla...@ngc.com wrote: However, the DB file is portable across big/little endian and 32/64 bit. So do your hash on the DB file and distribute that. Any reason you can't do that? My understanding is that he's having two different

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-04 Thread Jay A. Kreibich
. -j From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on behalf of Jay A. Kreibich [j...@kreibi.ch] Sent: Wednesday, April 04, 2012 7:26 AM To: General Discussion of SQLite Database Subject: EXT :Re: [sqlite] Hashing 2 SQLite db files

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-03 Thread Black, Michael (IS)
Database files are purportedly platform independent. So why don't you distribute the database file instead of building it? Then your checksum would be fine. Michael D. Black Senior Scientist Advanced Analytics Directorate Advanced GEOINT Solutions Operating Unit Northrop Grumman

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-03 Thread Jay A. Kreibich
On Tue, Apr 03, 2012 at 01:22:02AM +0100, Simon Slavin scratched on the wall: On 3 Apr 2012, at 12:27am, Webdude webd...@thewebdudes.com wrote: Does anyone know if SQLite stores additional unique internal information such as timestamps etc. that would affect this, and if so could these

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-03 Thread Webdude
Hi Jean-Christophe, thanks for your help. "Instead of trying to compare the hashes of DB files themselves, you appear to want a strict comparison of sets in the contents of the DBs." No, I physically need the end resulting file to hash to the same value. The file becomes a new identity in

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-03 Thread Simon Slavin
On 4 Apr 2012, at 2:15am, Webdude webd...@thewebdudes.com wrote: But the same SQLite version, using the same schema, set up with the same PRAGMAs, creating a db with the same data and in the same order, and despite hardware / HDD / OS, should still produce the same file byte-for-byte? And

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-03 Thread Webdude
Hi Jay, thanks for your help, > Does anyone know if SQLite stores additional unique internal information such as timestamps etc. that would affect this, and if so could these additional to the data variable features be disabled in any way? > SQLite files do contain

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-03 Thread Nico Williams
On Tue, Apr 3, 2012 at 8:27 PM, Webdude webd...@thewebdudes.com wrote: But if data was added exactly in the same way/order shouldn't the counters all count to the same end result if the process was repeated at a later time on another machine? Well, why not... try it? :)

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-03 Thread Jean-Christophe Deschamps
But if data was added exactly in the same way/order shouldn't the counters all count to the same end result if the process was repeated at a later time on another machine? Maybe, maybe not. Since the file format specifies meaningful fields only (my guess) it's quite possible that the

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-03 Thread Webdude
Hi again peeps, thanks for all your help. Seems there are many variables that could restrict doing this reliably. As several of you have mentioned, I should really rethink my design before this simple idea becomes far more complex than it needs to be. Cheers, David.

[sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Webdude
Hi, I am building a file comparison tool that is free as in beer and speech. The program allows people to put certain things in an SQLite database file, then an MD5 or SHA hash is run on the resulting file for identity of the total package contents. A key part of the design requires that if
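For reference, the file-level hash described here can be sketched as follows. This is only a minimal illustration and not the poster's actual code: the helper name `hash_file`, the default algorithm, and the chunk size are all assumptions made for the example.

```python
import hashlib


def hash_file(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path`.

    `hash_file` is a hypothetical helper; the file is read in chunks so
    that a large database does not have to fit in memory.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A hash like this covers every byte of the file, including any internal bookkeeping SQLite keeps, which is why the rest of the thread is concerned with whether two independently built files can be byte-identical.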

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Nico Williams
On Mon, Apr 2, 2012 at 6:27 PM, Webdude webd...@thewebdudes.com wrote: I am building a file comparison tool that is free as in beer and speech. The program allows people to put certain things in an SQLite database file, then an MD5 or SHA hash is run on the resulting file for identity of the

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Simon Slavin
On 3 Apr 2012, at 12:27am, Webdude webd...@thewebdudes.com wrote: A key part of the design requires that if another user who is using the same program, (and probably would have to be using the same version of the SQLite engine I suspect), if they put exactly the same items into their

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Nico Williams
Also, if you were to use the running XOR of hashes method you'd also have to not make use of auto-allocated row IDs or any INTEGER PRIMARY KEYs, or AUTOINCREMENTed columns, or to not include any of those in the hashes, which probably also means not using any of those in FOREIGN KEYs. That's...
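A rough sketch of the running-XOR idea, assuming it means hashing each row and XOR-ing the digests together so that row order stops mattering. The function name `xor_content_hash`, the use of SHA-256, and the `repr()` row encoding are assumptions made for illustration, not anything specified in the thread. In line with the caveat above, the caller passes only the columns that carry real data and leaves out auto-allocated IDs.

```python
import hashlib
import sqlite3


def xor_content_hash(conn, table, columns):
    """XOR together the SHA-256 digest of every row in `table`.

    Hypothetical helper: only the named `columns` are hashed, so
    auto-allocated row IDs can simply be left out of the list.
    """
    acc = 0
    col_list = ", ".join('"%s"' % c for c in columns)
    for row in conn.execute('SELECT %s FROM "%s"' % (col_list, table)):
        # repr() of the row tuple is used as a simple, deterministic encoding.
        encoded = repr(row).encode("utf-8")
        acc ^= int.from_bytes(hashlib.sha256(encoded).digest(), "big")
    return "%064x" % acc
```

One known weakness of plain XOR is that two identical rows cancel each other out, so this sketch only makes sense for tables whose hashed columns are unique per row.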

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Webdude
Hi Nico, thanks for the reply. You can't rely on two SQLite3 DBs with the same contents being equal files. The sequences of INSERT/UPDATE/DELETE statements that created the two files with the same contents can differ and thus result in different b-tree layouts. It's not important that
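Nico's point can be checked with a small experiment. The sketch below is an illustration under stated assumptions (the table name `items`, the helper names `build` and `compare_builds`, and the row counts are invented for the example): it builds two databases that end up with the same single row, but one of them first inserts and then deletes a batch of temporary rows.

```python
import os
import sqlite3
import tempfile


def build(path, churn):
    """Create a database whose final contents are one row ('a', 'hello')."""
    conn = sqlite3.connect(path)
    # Make sure freed pages stay in the file rather than being reclaimed.
    conn.execute("PRAGMA auto_vacuum = NONE")
    conn.execute("CREATE TABLE items (name TEXT PRIMARY KEY, payload TEXT)")
    if churn:
        rows = [("tmp%d" % i, "x" * 500) for i in range(1000)]
        conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
        conn.commit()
        conn.execute("DELETE FROM items WHERE name LIKE 'tmp%'")
        conn.commit()
    conn.execute("INSERT INTO items VALUES ('a', 'hello')")
    conn.commit()
    rows = conn.execute("SELECT name, payload FROM items ORDER BY name").fetchall()
    conn.close()
    return rows


def compare_builds():
    """Return (same_rows, plain_size, churned_size) for the two builds."""
    with tempfile.TemporaryDirectory() as d:
        plain = os.path.join(d, "plain.db")
        churned = os.path.join(d, "churned.db")
        same_rows = build(plain, churn=False) == build(churned, churn=True)
        return same_rows, os.path.getsize(plain), os.path.getsize(churned)
```

Both builds report identical rows, yet the churned file stays larger because the pages freed by the DELETE remain in the file, so a hash of the two files cannot match.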

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Webdude
Hi Simon, thanks for helping me with this. Inserting the same data in the same order on the same platform with the same (PRAGMA) settings would result in the files matching identically. Do you feel that the platform - Hardware / OS / some other factor could influence the way SQLite

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Igor Tandetnik
On 4/2/2012 9:37 PM, Webdude wrote: It's not important that the 2 db files are exactly the same all the time that people are editing them, but only when they 'finalise' a 'package'. So what if some code in the 'packaging' process performed a sequence of queries that read all the data from the

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Nico Williams
On Mon, Apr 2, 2012 at 8:37 PM, Webdude webd...@thewebdudes.com wrote: It's not important that the 2 db files are exactly the same all the time that people are editing them, but only when they 'finalise' a 'package'. So what if some code in the 'packaging' process performed a sequence of

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Webdude
> It's not important that the 2 db files are exactly the same all the time that people are editing them, but only when they 'finalise' a 'package'. So what if some code in the 'packaging' process performed a sequence of queries that read all the data from the db, table by table, and

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Jean-Christophe Deschamps
Do you feel that the platform - Hardware / OS / some other factor could influence the way SQLite performed its sequence? Instead of trying to compare the hashes of DB files themselves, you appear to want a strict comparison of sets in the contents of the DBs. For instance, changing the

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Simon Slavin
On 3 Apr 2012, at 2:42am, Webdude webd...@thewebdudes.com wrote: Hi Simon, thanks for helping me with this. Inserting the same data in the same order on the same platform with the same (PRAGMA) settings would result in the files matching identically. Do you feel that the platform -

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Webdude
> Inserting the same data in the same order on the same platform with the same (PRAGMA) settings would result in the files matching identically. > Do you feel that the platform - Hardware / OS / some other factor could influence the way SQLite performed its sequence? > SQLite

Re: [sqlite] Hashing 2 SQLite db files with the same data

2012-04-02 Thread Nico Williams
On Mon, Apr 2, 2012 at 11:39 PM, Webdude webd...@thewebdudes.com wrote: I'm sure everyone thinks I'm mad, but I still haven't seen proof of "Can't be done". The question is not "can this be done" but "should it be done this way". Relying on a sequence of SQL statements yielding exactly the same DB