[ https://issues.apache.org/jira/browse/DERBY-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306223#comment-17306223 ]
Holger Rehn edited comment on DERBY-7109 at 3/22/21, 2:18 PM: -------------------------------------------------------------- I was able to put together a simplified test reproducing the described problem. It's enough to simply add a number of rows into a single table and then remove them all. Turns out my previous assumptions were incorrect and just side-effects of the way my 1st test worked. The key point seems to be raw table size. This are my findings: # if a table grows above a certain size (~2.x GiB), inplace compression seems only to be able to reclaim the disk space exceeding that boundary # if a table grows above X times that size, inplace compression can only reclaim the disk space exceeding X times that limit # its enough to insert rows until the table reaches that boundary and then remove them all, I didn't find additional requirements: ** the DB doesn't have to be fragmented ** the table doesn't have to have an index ** the row data doesn't have to be of the same (or a certain) size ** the table doesn't have to have a BLOB column ** it doesn't make any difference if the rows are removed one by one using an ID or altogether by a global "delete from" The attached test is using a table with a single BLOB column and larger chunks of data for performance reasons. I also tried with a table only containing a few BIGINT columns: The results are comparable and the problem can be reproduced. But without a BLOB column the test runs ~2 hours instad of a few seconds just inserting rows to reach the critical DB size. Apart from that the test should be self-explanatory. One more thing that caught my eye: If i modify the test to produce less than the critical amount of table data, SYSCS_INPLACE_COMPRESS_TABLE does even reclaim *more* disk space than SYSCS_COMPRESS_TABLE. Is that an expected behaviour? From reading the docs I wouldn't expect that to be possible. was (Author: ickzon): I was able to put together a simplified test reproducing the described problem. It's enough to simply add a number of rows into a single table and then remove them all. Turns out my previous assumptions were incorrect and just side-effects of the way my 1st test worked. The key point seems to be raw table size. This are my findings: # if a table grows above a certain size (~2.x GiB), inplace compression seems only to be able to reclaim the disk space exceeding that boundary # if a table grows above X times that size, inplace compression can only reclaim the disk space exceeding X times that limit # its enough to insert rows until the table reaches that size limit and then remove them all, I didn't find additional requirements: ** the DB doesn't have to be fragmented ** the table doesn't have to have an index ** the row data doesn't have to be of the same (or a certain) size ** the table doesn't have to have a BLOB column ** it doesn't make any difference if the rows are removed one by one using an ID or altogether by a global "delete from" The attached test is using a table with a single BLOB column and larger chunks of data for performance reasons. I also tried with a table only containing a few BIGINT columns: The results are comparable and the problem can be reproduced. But without a BLOB column the test runs ~2 hours instad of a few seconds just inserting rows to reach the critical DB size. Apart from that the test should be self-explanatory. One more thing that caught my eye: If i modify the test to produce less than the critical amount of table data, SYSCS_INPLACE_COMPRESS_TABLE does even reclaim *more* disk space than SYSCS_COMPRESS_TABLE. Is that an expected behaviour? From reading the docs I wouldn't expect that to be possible. > SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE doesn't work as expected > ---------------------------------------------------------------- > > Key: DERBY-7109 > URL: https://issues.apache.org/jira/browse/DERBY-7109 > Project: Derby > Issue Type: Bug > Components: Store > Affects Versions: 10.14.2.0, 10.15.2.0 > Reporter: Holger Rehn > Priority: Major > Attachments: DerbyTest.java > > > The SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE procedure seems to be unable to > reclaim any disk space under almost all circumstances. I have a number of > existing DBs that I tried to reorganize to reclaim disk space using the > mentioned procedure. Because that didn't even free the smallest amount of > space, I started a number of tests and it looks like inplace compression > simply doesn't work, at all. > Example: > One of the DBs uses ~37 GB of disk space and the exact same amount after > running SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE. After using > SYSCS_UTIL.SYSCS_COMPRESS_TABLE to reorganize the DB it only uses 20,4 GB of > disk space. This might be correct, if I correctly understand how inplace > compression should work - but is not really what I expected. But even after I > completely deleted a number of tables and removed more than 90% of the rows > from all other tables of the original DB, > SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE was unable to reclaim any disk space. > In contrast, SYSCS_UTIL.SYSCS_COMPRESS_TABLE now reduced the DB size to ~ > 130MB. To carry this to the extreme, I completely removed any data from all > tables. As expected, procedure SYSCS_UTIL.SYSCS_COMPRESS_TABLE now results in > an empty DB only taking up ~4 MB of disk space, while > SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE was still unable to reclaim a > relevant amount of disk space (only ~ 4,6GB), leaving a completely empty DB > still occupying more than 32 GB of disk space. > In all calls to SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE I set PURGE_ROWS, > DEFRAGMENT_ROWS, and TRUNCATE_END to 1. -- This message was sent by Atlassian Jira (v8.3.4#803005)