Hi,

I checked the download mentioned in the original email. Not sure if the
table changed since the previous posts.


It seems LUTFullString has 3 BLOB rows, but LENGTH treats them as
strings.

I'm in Melbourne, Oz, so I added the UTC datetime.

regs, Kev

kevin@KCYDell:~$ cd /mnt/KCY/KCYDocs/
kevin@KCYDell:/mnt/KCY/KCYDocs$ sqlite3 /mnt/KCY/KCYDocs/200k-per-
row.sqlite
SQLite version 3.30.1 2019-10-10 20:19:45
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);
sqlite> 
sqlite> select '1',rowid,LENGTH(HEX(LUTFullString))/2 from copied
   ...> UNION ALL
   ...> select '2',rowid,INSTR(HEX(LUTFullString),'00') from copied
   ...> UNION ALL
   ...> select '3',rowid,substr(HEX(LUTFullString),0,10) from copied
   ...> UNION ALL
   ...> select '4',rowid,INSTR(SUBSTR(HEX(LUTFullString),4),'00') from
copied
   ...> UNION ALL
   ...> select '5',rowid,LENGTH(LUTFullString) from copied;
1|1|194238
1|2|183050
1|3|193908
2|1|3
2|2|3
2|3|3
3|1|2C0003007
3|2|2C0003007
3|3|2C0003007
4|1|1
4|2|1
4|3|1
5|1|1
5|2|1
5|3|1
sqlite> .quit
kevin@KCYDell:/mnt/KCY/KCYDocs$ date -u
Sat 11 Jan 23:39:43 UTC 2020
kevin@KCYDell:/mnt/KCY/KCYDocs$ 









Message: 6
Date: Fri, 10 Jan 2020 08:48:21 -0500
From: Ryan Mack <ryan.m...@gmail.com>
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Unexplained table bloat
Message-ID:
        <
CABhGdGRbR1kT+3_BU6ob9L7tpSPZ09HJn=ofPyK6OXvgQK=_...@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"

Hi list,

I'm trying to understand unexplained table bloat I found in what should
be
a very small table in an old database file. If you dump/restore the
database, the problem goes away. If you duplicate the table, the
problem
propagates. Schema:

CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);

Test Data:
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279

In my test database file, each of these 3 rows is allocating about 40
4k
overflow pages for a total database size of about 500k. The full
database
has about 4MB of actual data which takes up over 500MB on disk. If you
want
to see/reproduce the problem you'll need my test database file which
I've
uploaded here: https://mackman.net/200k-per-row.sqlite.zip (500kb
download). I don't know why it doesn't compress better, those extra
overflow pages must be getting filled with random garbage.

My uninformed guess is there was a bug in the version of sqlite used at

database creation time that computed an incorrect overflow threshold
and is
storing each byte of the row to its own page. Since the problem goes
away
with a dump/restore, I'm considering releasing a script to do that and
mitigate the problem for affected users. Before doing that I would like
to
understand the problem better.

Thanks for reading, Ryan

PS: Here's some output from my debug session showing the 123 bytes of
data
is occupying 582k of space on disk in a freshly created table.

% sqlite3 200k-per-row.sqlite
SQLite version 3.28.0 2019-04-15 14:49:49
Enter ".help" for usage hints.
sqlite> .tables
copied

sqlite> .schema copied
CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);

sqlite> select * from copied;
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279

sqlite> select sum(length(id_local) + length(lutfullstring) +
length(luthash)) from copied;
123

sqlite> create table copied2 as select * from copied;
sqlite> SELECT name, SUM("pgsize") as sz FROM dbstat group by name;
copied|581632
copied2|581632
sqlite_master|4096



_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to