OK, I think I've got a better sense now. Hex encoding the column shows that there's actually a huge amount of data stored in there. For some reason length() isn't revealing it even if the column type is blob. Dumping and restoring the table is truncating the data.
On Fri, Jan 10, 2020 at 9:58 AM Ryan Mack <ryan.m...@gmail.com> wrote: > > Accidentally sent my first reply direct, responding to the list. I'm > now wondering if there's a lot of binary data hidden in each row. > Trying to figure out how to determine that if length() doesn't show > anything. > > > Prior response: > > An excellent idea, thank you :-) . The output is included below. > > I am continuing to do my own debugging in parallel. I am now leaning > towards a new hypothesis that there may be a bug handling > strange/malformed unicode that is resulting in a large amount of > garbage data being stored into the record. I'm trying to figure out > the exact conditions of copying/loading data into the table that > causes the issue to propagate or disappear. > > > % ./sqlite3_analyzer ~/Downloads/200k-per-row.sqlite > /** Disk-Space Utilization Report For > /Users/nadia/Downloads/200k-per-row.sqlite > > Page size in bytes................................ 4096 > Pages in the whole file (measured)................ 143 > Pages in the whole file (calculated).............. 143 > Pages that store data............................. 143 100.0% > Pages on the freelist (per header)................ 0 0.0% > Pages on the freelist (calculated)................ 0 0.0% > Pages of auto-vacuum overhead..................... 0 0.0% > Number of tables in the database.................. 2 > Number of indices................................. 0 > Number of defined indices......................... 0 > Number of implied indices......................... 0 > Size of the file in bytes......................... 585728 > Bytes of user payload stored...................... 571322 97.5% > > *** Page counts for all tables with their indices > ***************************** > > COPIED............................................ 142 99.30% > SQLITE_MASTER..................................... 1 0.70% > > *** Page counts for all tables and indices separately > ************************* > > COPIED............................................ 142 99.30% > SQLITE_MASTER..................................... 1 0.70% > > *** All tables > **************************************************************** > > Percentage of total database...................... 100.0% > Number of entries................................. 4 > Bytes of storage consumed......................... 585728 > Bytes of payload.................................. 571412 97.6% > Bytes of metadata................................. 1284 0.22% > Average payload per entry......................... 142853.00 > Average unused bytes per entry.................... 3393.00 > Average metadata per entry........................ 321.00 > Average fanout.................................... 3.00 > Maximum payload per entry......................... 194280 > Entries that use overflow......................... 3 75.0% > Index pages used.................................. 1 > Primary pages used................................ 4 > Overflow pages used............................... 138 > Total pages used.................................. 143 > Unused bytes on index pages....................... 4070 99.37% > Unused bytes on primary pages..................... 9502 58.0% > Unused bytes on overflow pages.................... 0 0.0% > Unused bytes on all pages......................... 13572 2.3% > > *** Table COPIED > ************************************************************** > > Percentage of total database...................... 99.30% > Number of entries................................. 3 > Bytes of storage consumed......................... 581632 > Bytes of payload.................................. 571322 98.2% > Bytes of metadata................................. 1172 0.20% > B-tree depth...................................... 2 > Average payload per entry......................... 190440.67 > Average unused bytes per entry.................... 3226.00 > Average metadata per entry........................ 390.67 > Average fanout.................................... 3.00 > Non-sequential pages.............................. 0 0.0% > Maximum payload per entry......................... 194280 > Entries that use overflow......................... 3 100.0% > Index pages used.................................. 1 > Primary pages used................................ 3 > Overflow pages used............................... 138 > Total pages used.................................. 142 > Unused bytes on index pages....................... 4070 99.37% > Unused bytes on primary pages..................... 5608 45.6% > Unused bytes on overflow pages.................... 0 0.0% > Unused bytes on all pages......................... 9678 1.7% > > *** Table SQLITE_MASTER > ******************************************************* > > Percentage of total database...................... 0.70% > Number of entries................................. 1 > Bytes of storage consumed......................... 4096 > Bytes of payload.................................. 90 2.2% > Bytes of metadata................................. 112 2.7% > B-tree depth...................................... 1 > Average payload per entry......................... 90.00 > Average unused bytes per entry.................... 3894.00 > Average metadata per entry........................ 112.00 > Maximum payload per entry......................... 90 > Entries that use overflow......................... 0 0.0% > Primary pages used................................ 1 > Overflow pages used............................... 0 > Total pages used.................................. 1 > Unused bytes on primary pages..................... 3894 95.1% > Unused bytes on overflow pages.................... 0 > Unused bytes on all pages......................... 3894 95.1% > > On Fri, Jan 10, 2020 at 9:45 AM Richard Hipp <d...@sqlite.org> wrote: > > > > On 1/10/20, Ryan Mack <ryan.m...@gmail.com> wrote: > > > > > > I'm trying to understand unexplained table bloat > > > > The sqlite3_analyzer command-line utility program (available in the > > "Precompiled binaries" bundles on the https://sqlite.org/download.html > > page) is designed to help understand these kinds of problems. Please > > run that utility on the database and perhaps post the output here. > > -- > > D. Richard Hipp > > d...@sqlite.org _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users