*D. R.* Hipp,

Thank you so much for looking at the database file for me.  I'll explain
what I found out so if anyone else runs into the same problem it'll be more
clear.

I am using a JDBC driver, and JDBC is supposed to be a standard API, but
there are different versions.  In many of the databases I've tried I have
not had to close a preparedstatement after a batch commit.  Using the xerial
JDBC driver, I have to close it and re-prepare the statement after every
batch commit or else something screwy goes on and more records are added
than expected.  My guess is that batch queries from the last commit were not
removed properly and were done over again multiple times.

The reason I did not initially check the number of records was because I had
tested the same JDBC code against other databases and the record number was
what was expected.  This xerial JDBC driver was a little unusual, and does
not seem to have the same implementation as the others.

Thanks again for your help,
Julian

On Mon, Oct 27, 2008 at 9:55 AM, D. Richard Hipp <[EMAIL PROTECTED]> wrote:

>
> On Oct 26, 2008, at 10:01 PM, Julian Bui wrote:
>
>  Hi Dr. Hipp,
>>
>> Thanks for responding (and without rudeness).
>>
>> I'm running my db from a java app within linux using the latest version of
>> sqlite released and the latest version of the xerial jdbc driver.  Again, I
>> only index over a single column which holds a double.
>>
>> I start with a clean database and insert rows and find out how big the
>> database is.
>>
>> ACTUAL RUNS:
>> 100 records = 3.0 KB
>> 1,000 records = 49 KB
>> 10,000 records = 2.5 MB
>> 100,000 records = 240 MB
>>
>> EXPECTED RESULTS - with 28 bytes per record estimate of actual data (not
>> including index):
>> 100 records = 100 * 28 = 2.73 KB (agrees with actual run)
>> 1000 records = 1000 * 28 = 27.34 KB (slipping away from actual run size,
>> but still within the same magnitude)
>> 10,000 records = 10,000 * 28 = 0.267 MB (disagrees with actual result, is
>> index size growing exponentially?)
>> 100,000 records = 100,000 * 28 = 2.67 MB (disagrees with actual result, is
>> index size growing exponentially?)
>>
>> I've attached the 100 and 10,000 record dbs.
>>
>>
>
> I ran the sqlite3_analyzer program against your db10000.db database file.
>  And despite its name, it appears to contain 55002 entries, not 10,000 as
> claimed.  The average size of each entry in the table is 26 bytes and each
> index entry consumes about 13 bytes.  55001*(26+13) is about 2.1MB which is
> not far from the actual database file size of 2.6MB.
>
> I think the growth in database size is probably occurring because you are
> inserting more values into the database than you think you are.
>
> The first few pages of the output of sqlite3_analyzer are attached.
>
> D. Richard Hipp
> [EMAIL PROTECTED]
>
> /** Disk-Space Utilization Report For db10000.db
> *** As of 2008-Oct-27 12:45:37
>
> Page size in bytes.................... 1024
> Pages in the whole file (measured).... 2597
> Pages in the whole file (calculated).. 2597
> Pages that store data................. 2597       100.0%
> Pages on the freelist (per header).... 0            0.0%
> Pages on the freelist (calculated).... 0            0.0%
> Pages of auto-vacuum overhead......... 0            0.0%
> Number of tables in the database...... 2
> Number of indices..................... 1
> Number of named indices............... 1
> Automatically generated indices....... 0
> Size of the file in bytes............. 2659328
> Bytes of user payload stored.......... 1374633     51.7%
>
> *** Page counts for all tables with their indices ********************
>
> CAN_MESSAGES.......................... 2596        99.961%
> SQLITE_MASTER......................... 1            0.039%
>
> *** All tables and indices *******************************************
>
> Percentage of total database.......... 100.0%
> Number of entries..................... 110002
> Bytes of storage consumed............. 2659328
> Bytes of payload...................... 2001976     75.3%
> Average payload per entry............. 18.20
> Average unused bytes per entry........ 1.27
> Average fanout........................ 95.00
> Fragmentation.........................  67.9%
> Maximum payload per entry............. 156
> Entries that use overflow............. 0            0.0%
> Index pages used...................... 18
> Primary pages used.................... 2579
> Overflow pages used................... 0
> Total pages used...................... 2597
> Unused bytes on index pages........... 3430        18.6%
> Unused bytes on primary pages......... 136177       5.2%
> Unused bytes on overflow pages........ 0
> Unused bytes on all pages............. 139607       5.2%
>
> *** All tables *******************************************************
>
> Percentage of total database..........  66.2%
> Number of entries..................... 55002
> Bytes of storage consumed............. 1759232
> Bytes of payload...................... 1374870     78.2%
> Average payload per entry............. 25.00
> Average unused bytes per entry........ 0.77
> Average fanout........................ 95.00
> Fragmentation.........................  51.5%
> Maximum payload per entry............. 156
> Entries that use overflow............. 0            0.0%
> Index pages used...................... 18
> Primary pages used.................... 1700
> Overflow pages used................... 0
> Total pages used...................... 1718
> Unused bytes on index pages........... 3430        18.6%
> Unused bytes on primary pages......... 38731        2.2%
> Unused bytes on overflow pages........ 0
> Unused bytes on all pages............. 42161        2.4%
>
> *** All indices ******************************************************
>
> Percentage of total database..........  33.8%
> Number of entries..................... 55000
> Bytes of storage consumed............. 900096
> Bytes of payload...................... 627106      69.7%
> Average payload per entry............. 11.40
> Average unused bytes per entry........ 1.77
> Fragmentation......................... 100.0%
> Maximum payload per entry............. 12
> Entries that use overflow............. 0            0.0%
> Primary pages used.................... 879
> Overflow pages used................... 0
> Total pages used...................... 879
> Unused bytes on primary pages......... 97446       10.8%
> Unused bytes on overflow pages........ 0
> Unused bytes on all pages............. 97446       10.8%
>
>
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to