[sqlite] Creating a stable database

2015-09-28 Thread Aaron Digulla

Am Montag, 28. September 2015 15:55 CEST, "R.Smith"  
schrieb: 

> 
> 
> On 2015-09-28 11:30 AM, Aaron Digulla wrote:
> >   
> > Am Freitag, 25. September 2015 20:46 CEST, Warren Young  > etr-usa.com> schrieb:
> >   
> >>
> >> That suggests a good test for Aaron, then:
> >>
> >>  sqlite3 my.db .dump > dump.sql
> >>  sqlite3 a.db < dump.sql
> >>  sqlite3 b.db < dump.sql
> > Thank you very much. This set of commands has allowed me to narrow down the 
> > root cause to the Java JDBC driver by xerial 
> > (https://github.com/xerial/sqlite-jdbc):
> >
> > When I dump the database and recreate it two times with the command line 
> > sqlite3 tool, I get the exact same files. Looking at the source code of the 
> > JDBC driver, I can't see anything obviously wrong, either. There is one 
> > malloc() where it doesn't clear the memory but
> > I'm not creating functions, so the code is never used.
> >
> > The other malloc() when I read metadata from the database.
> >
> > That leaves me with a bug in my code or some really odd behavior because of 
> > the Java VM.
> 
> Also note that if the Java VM builds the DB in any way other than a 

> straight dump-the-sql-to-SQLite-Execute method, it might for instance 
> add comments to the Table definitions, often if this is done a date / 
> time is included, which by itself will cause different byte images for 
> the DB file.

I wrote the client side of the code (which executes CREATE TABLE and INSERT 
statements); there is nothing there.

Also, the JDBC driver is only a thin wrapper around the C API, so I don't 
expect oddities there. But I think I found it, now:

There is code which optimizes imports from CSV files by remembering the 
timestamp, size, etc. The API was a bit stubborn, which means I couldn't 
disable this code - instead I fed the code fake input files (table name = file 
name, size=0, timestamp=now) ... The result was a series of small changes in 
several places (= a new place every time the last page for the timestamp table 
was full).

sqlite works :-)

Regards,

-- 
Aaron "Optimizer" Digulla a.k.a. Philmann Dark
"It's not the universe that's limited, it's our imagination.
Follow me and I'll show you something beyond the limits." 
http://blog.pdark.de/ 



[sqlite] Creating a stable database

2015-09-28 Thread Aaron Digulla

Am Freitag, 25. September 2015 20:46 CEST, Warren Young  
schrieb: 

> On Sep 25, 2015, at 11:59 AM, Richard Hipp  wrote:
> > 
> > On 9/25/15, Aaron Digulla  wrote:
> >> 
> >> I was wondering if it was possible to create the same database (= no binary
> >> difference) twice with sqlite.
> > 
> > It works fine to create identical database files when I try it:
> > 
> >   drh at bella:~/sqlite/bld$ ./sqlite3 db1  >   drh at bella:~/sqlite/bld$ ./sqlite3 db2  
> That suggests a good test for Aaron, then:
> 
> sqlite3 my.db .dump > dump.sql
> sqlite3 a.db < dump.sql
> sqlite3 b.db < dump.sql

Thank you very much. This set of commands has allowed me to narrow down the 
root cause to the Java JDBC driver by xerial 
(https://github.com/xerial/sqlite-jdbc):

When I dump the database and recreate it two times with the command line 
sqlite3 tool, I get the exact same files. Looking at the source code of the 
JDBC driver, I can't see anything obviously wrong, either. There is one 
malloc() where it doesn't clear the memory but 
I'm not creating functions, so the code is never used.

The other malloc() when I read metadata from the database.

That leaves me with a bug in my code or some really odd behavior because of the 
Java VM.

Regards, 

-- 
Aaron "Optimizer" Digulla a.k.a. Philmann Dark
"It's not the universe that's limited, it's our imagination.
Follow me and I'll show you something beyond the limits." 
http://blog.pdark.de/ 



[sqlite] Creating a stable database

2015-09-25 Thread Aaron Digulla
Hello,

I was wondering if it was possible to create the same database (= no binary 
difference) twice with sqlite.

I tried by creating all tables in the same order and then inserting all rows in 
order of the primary key. But if I create two database files with the same 
code, the files are different (I checked with cmp(1) on Linux).

Is there a way to create database files which only change when the data inside 
changes?

I'm using sqlite 3.8.10.1

-- 
Aaron "Optimizer" Digulla a.k.a. Philmann Dark
"It's not the universe that's limited, it's our imagination.
Follow me and I'll show you something beyond the limits." 
http://blog.pdark.de/