Accidentally sent my first reply direct, responding to the list. I'm now wondering if there's a lot of binary data hidden in each row. Trying to figure out how to determine that if length() doesn't show anything.
Prior response: An excellent idea, thank you :-) . The output is included below. I am continuing to do my own debugging in parallel. I am now leaning towards a new hypothesis that there may be a bug handling strange/malformed unicode that is resulting in a large amount of garbage data being stored into the record. I'm trying to figure out the exact conditions of copying/loading data into the table that causes the issue to propagate or disappear. % ./sqlite3_analyzer ~/Downloads/200k-per-row.sqlite /** Disk-Space Utilization Report For /Users/nadia/Downloads/200k-per-row.sqlite Page size in bytes................................ 4096 Pages in the whole file (measured)................ 143 Pages in the whole file (calculated).............. 143 Pages that store data............................. 143 100.0% Pages on the freelist (per header)................ 0 0.0% Pages on the freelist (calculated)................ 0 0.0% Pages of auto-vacuum overhead..................... 0 0.0% Number of tables in the database.................. 2 Number of indices................................. 0 Number of defined indices......................... 0 Number of implied indices......................... 0 Size of the file in bytes......................... 585728 Bytes of user payload stored...................... 571322 97.5% *** Page counts for all tables with their indices ***************************** COPIED............................................ 142 99.30% SQLITE_MASTER..................................... 1 0.70% *** Page counts for all tables and indices separately ************************* COPIED............................................ 142 99.30% SQLITE_MASTER..................................... 1 0.70% *** All tables **************************************************************** Percentage of total database...................... 100.0% Number of entries................................. 4 Bytes of storage consumed......................... 585728 Bytes of payload.................................. 571412 97.6% Bytes of metadata................................. 1284 0.22% Average payload per entry......................... 142853.00 Average unused bytes per entry.................... 3393.00 Average metadata per entry........................ 321.00 Average fanout.................................... 3.00 Maximum payload per entry......................... 194280 Entries that use overflow......................... 3 75.0% Index pages used.................................. 1 Primary pages used................................ 4 Overflow pages used............................... 138 Total pages used.................................. 143 Unused bytes on index pages....................... 4070 99.37% Unused bytes on primary pages..................... 9502 58.0% Unused bytes on overflow pages.................... 0 0.0% Unused bytes on all pages......................... 13572 2.3% *** Table COPIED ************************************************************** Percentage of total database...................... 99.30% Number of entries................................. 3 Bytes of storage consumed......................... 581632 Bytes of payload.................................. 571322 98.2% Bytes of metadata................................. 1172 0.20% B-tree depth...................................... 2 Average payload per entry......................... 190440.67 Average unused bytes per entry.................... 3226.00 Average metadata per entry........................ 390.67 Average fanout.................................... 3.00 Non-sequential pages.............................. 0 0.0% Maximum payload per entry......................... 194280 Entries that use overflow......................... 3 100.0% Index pages used.................................. 1 Primary pages used................................ 3 Overflow pages used............................... 138 Total pages used.................................. 142 Unused bytes on index pages....................... 4070 99.37% Unused bytes on primary pages..................... 5608 45.6% Unused bytes on overflow pages.................... 0 0.0% Unused bytes on all pages......................... 9678 1.7% *** Table SQLITE_MASTER ******************************************************* Percentage of total database...................... 0.70% Number of entries................................. 1 Bytes of storage consumed......................... 4096 Bytes of payload.................................. 90 2.2% Bytes of metadata................................. 112 2.7% B-tree depth...................................... 1 Average payload per entry......................... 90.00 Average unused bytes per entry.................... 3894.00 Average metadata per entry........................ 112.00 Maximum payload per entry......................... 90 Entries that use overflow......................... 0 0.0% Primary pages used................................ 1 Overflow pages used............................... 0 Total pages used.................................. 1 Unused bytes on primary pages..................... 3894 95.1% Unused bytes on overflow pages.................... 0 Unused bytes on all pages......................... 3894 95.1% On Fri, Jan 10, 2020 at 9:45 AM Richard Hipp <d...@sqlite.org> wrote: > > On 1/10/20, Ryan Mack <ryan.m...@gmail.com> wrote: > > > > I'm trying to understand unexplained table bloat > > The sqlite3_analyzer command-line utility program (available in the > "Precompiled binaries" bundles on the https://sqlite.org/download.html > page) is designed to help understand these kinds of problems. Please > run that utility on the database and perhaps post the output here. > -- > D. Richard Hipp > d...@sqlite.org _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users