Am 28.11.2008 um 09:20 schrieb Simon Bulman:

> Hi Jens,
>
> Thanks for your input. UTF-8 did not make a difference. I expected  
> that
> SQLite file would be larger on disk than our proprietary format  
> because of
> the overheads that you mention - I am surprised however it at least 7x
> larger.

To be honest - given your table definitions below, I'm surprised the  
database is _that_ large, too:

Table 1 - according to your definition - should contain at most about  
50 bytes of pure data per row (plus the overhead needed by SQLite).
Table 2 would only contain ~ 16 bytes of data per row.

Dividing the database disk size by the total number of rows you  
mentioned, would indicate a whopping 8k per row.

I did a quick test and created a schema similar to what you outlined  
and filled it with data (the same number of rows you mentioned and 28  
and ~20 characters per row for the two varchar columns) and my  
database ended up being 71kB in size instead of the 11.8 MB you saw...

Are there any other tables that contain non-negligible amounts of data?
Are the data sizes indeed what's indicated in the schema (since SQLite  
doesn't really care about the varchar size constraints you can  
actually put any amount of data into a varchar(30) column) ?

</jum>


>
> I am actually recreating the whole database (delete file and recreate)
> programmatically so vacuuming has not effect.
>
> Cheers,
> S.
>
> -----Original Message-----
> From: Jens Miltner [mailto:[EMAIL PROTECTED]
> Sent: 27 November 2008 13:48
> To: General Discussion of SQLite Database
> Cc: [EMAIL PROTECTED]
> Subject: Re: [sqlite] Database file size
>
>
> Am 27.11.2008 um 09:12 schrieb Simon Bulman:
>
>> I have been playing around with SQLite to use as an alternative to
>> one of
>> our proprietary file formats used to read large amounts of data. Our
>> proprietary format performs very badly i.e. takes a long time to
>> load some
>> data; as expected SQLite is lighting quick in comparison - great!
>>
>> One considerable stumbling block is the footprint (size) of the
>> database
>> file on disk. It turns out that SQLite is roughly 7x larger than our
>> proprietary format - this is prohibitive. The data is pretty simple
>> really,
>> 2 tables
>>
>> Table 1
>>
>> BIGINT (index),  VARCHAR(30), VARCHAR(10)
>>
>>
>> Table 2
>>
>> BIGINT (index), FLOAT
>>
>>
>> For a particular data set Table1 has 1165 rows and Table 2 has 323
>> rows,
>> however typically Table 2 becomes bigger for larger models. The size
>> on disk
>> of this file is 11.8 Mb (compared to 1.7 Mb for our proprietary
>> format). I
>> have noticed that if I drop the indexes the size drops dramatically -
>> however the query performance suffers to an unacceptable level.
>>
>> For a larger model the DB footprint is 2.2 Gb compared to 267 Mb for
>> the
>> proprietary format.
>>
>> Does anybody have any comments on this? Are there any configuration
>> options
>> or ideas I could use to reduce the footprint of the db file?
>
>
> I don't think you'll be able to make SQLite as efficient (regarding
> storage size) as a custom file format, because it has to have some
> overhead for indexes, etc.
>
> However, one thing that comes to mind is the way string data is  
> stored:
> If you're concerned about disk space an your string data is mostly
> ASCII, make sure your strings are stored as UTF-8 - for ASCII string
> data, this will save you one byte per character in the string data
> storage.
> To enforce UTF-8 string storage, execute "PRAGMA encoding='UTF-8'" as
> the first command when creating the database (before you create and
> tables).
> You can query the format using "PRAGMA encoding" - UTF-16 encodings
> will store two bytes / character, regardless of the actual  
> characters...
>
> Note that this doesn't mean your database size will shrink to half the
> size - it merely means you'll be able to fit more rows onto a single
> page, thus eventually you should see a decrease in file size when
> comparing UTF-16 vs. UTF-8 databases.
>
> BTW: are you aware that SQLite database won't shrink by themselves?
> You'll have to vacuum them to reclaim unused space (see
> <http://www.sqlite.org/faq.html#q12
>> )
>
> HTH,
> </jum>
>

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to