*D. R.* Hipp, Thank you so much for looking at the database file for me. I'll explain what I found out so if anyone else runs into the same problem it'll be more clear.
I am using a JDBC driver, and JDBC is supposed to be a standard API, but there are different versions. In many of the databases I've tried I have not had to close a preparedstatement after a batch commit. Using the xerial JDBC driver, I have to close it and re-prepare the statement after every batch commit or else something screwy goes on and more records are added than expected. My guess is that batch queries from the last commit were not removed properly and were done over again multiple times. The reason I did not initially check the number of records was because I had tested the same JDBC code against other databases and the record number was what was expected. This xerial JDBC driver was a little unusual, and does not seem to have the same implementation as the others. Thanks again for your help, Julian On Mon, Oct 27, 2008 at 9:55 AM, D. Richard Hipp <[EMAIL PROTECTED]> wrote: > > On Oct 26, 2008, at 10:01 PM, Julian Bui wrote: > > Hi Dr. Hipp, >> >> Thanks for responding (and without rudeness). >> >> I'm running my db from a java app within linux using the latest version of >> sqlite released and the latest version of the xerial jdbc driver. Again, I >> only index over a single column which holds a double. >> >> I start with a clean database and insert rows and find out how big the >> database is. >> >> ACTUAL RUNS: >> 100 records = 3.0 KB >> 1,000 records = 49 KB >> 10,000 records = 2.5 MB >> 100,000 records = 240 MB >> >> EXPECTED RESULTS - with 28 bytes per record estimate of actual data (not >> including index): >> 100 records = 100 * 28 = 2.73 KB (agrees with actual run) >> 1000 records = 1000 * 28 = 27.34 KB (slipping away from actual run size, >> but still within the same magnitude) >> 10,000 records = 10,000 * 28 = 0.267 MB (disagrees with actual result, is >> index size growing exponentially?) >> 100,000 records = 100,000 * 28 = 2.67 MB (disagrees with actual result, is >> index size growing exponentially?) >> >> I've attached the 100 and 10,000 record dbs. >> >> > > I ran the sqlite3_analyzer program against your db10000.db database file. > And despite its name, it appears to contain 55002 entries, not 10,000 as > claimed. The average size of each entry in the table is 26 bytes and each > index entry consumes about 13 bytes. 55001*(26+13) is about 2.1MB which is > not far from the actual database file size of 2.6MB. > > I think the growth in database size is probably occurring because you are > inserting more values into the database than you think you are. > > The first few pages of the output of sqlite3_analyzer are attached. > > D. Richard Hipp > [EMAIL PROTECTED] > > /** Disk-Space Utilization Report For db10000.db > *** As of 2008-Oct-27 12:45:37 > > Page size in bytes.................... 1024 > Pages in the whole file (measured).... 2597 > Pages in the whole file (calculated).. 2597 > Pages that store data................. 2597 100.0% > Pages on the freelist (per header).... 0 0.0% > Pages on the freelist (calculated).... 0 0.0% > Pages of auto-vacuum overhead......... 0 0.0% > Number of tables in the database...... 2 > Number of indices..................... 1 > Number of named indices............... 1 > Automatically generated indices....... 0 > Size of the file in bytes............. 2659328 > Bytes of user payload stored.......... 1374633 51.7% > > *** Page counts for all tables with their indices ******************** > > CAN_MESSAGES.......................... 2596 99.961% > SQLITE_MASTER......................... 1 0.039% > > *** All tables and indices ******************************************* > > Percentage of total database.......... 100.0% > Number of entries..................... 110002 > Bytes of storage consumed............. 2659328 > Bytes of payload...................... 2001976 75.3% > Average payload per entry............. 18.20 > Average unused bytes per entry........ 1.27 > Average fanout........................ 95.00 > Fragmentation......................... 67.9% > Maximum payload per entry............. 156 > Entries that use overflow............. 0 0.0% > Index pages used...................... 18 > Primary pages used.................... 2579 > Overflow pages used................... 0 > Total pages used...................... 2597 > Unused bytes on index pages........... 3430 18.6% > Unused bytes on primary pages......... 136177 5.2% > Unused bytes on overflow pages........ 0 > Unused bytes on all pages............. 139607 5.2% > > *** All tables ******************************************************* > > Percentage of total database.......... 66.2% > Number of entries..................... 55002 > Bytes of storage consumed............. 1759232 > Bytes of payload...................... 1374870 78.2% > Average payload per entry............. 25.00 > Average unused bytes per entry........ 0.77 > Average fanout........................ 95.00 > Fragmentation......................... 51.5% > Maximum payload per entry............. 156 > Entries that use overflow............. 0 0.0% > Index pages used...................... 18 > Primary pages used.................... 1700 > Overflow pages used................... 0 > Total pages used...................... 1718 > Unused bytes on index pages........... 3430 18.6% > Unused bytes on primary pages......... 38731 2.2% > Unused bytes on overflow pages........ 0 > Unused bytes on all pages............. 42161 2.4% > > *** All indices ****************************************************** > > Percentage of total database.......... 33.8% > Number of entries..................... 55000 > Bytes of storage consumed............. 900096 > Bytes of payload...................... 627106 69.7% > Average payload per entry............. 11.40 > Average unused bytes per entry........ 1.77 > Fragmentation......................... 100.0% > Maximum payload per entry............. 12 > Entries that use overflow............. 0 0.0% > Primary pages used.................... 879 > Overflow pages used................... 0 > Total pages used...................... 879 > Unused bytes on primary pages......... 97446 10.8% > Unused bytes on overflow pages........ 0 > Unused bytes on all pages............. 97446 10.8% > > > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users