[ 
https://issues.apache.org/jira/browse/DERBY-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306223#comment-17306223
 ] 

Holger Rehn edited comment on DERBY-7109 at 3/22/21, 2:18 PM:
--------------------------------------------------------------

I was able to put together a simplified test reproducing the described problem. 
It's enough to simply add a number of rows into a single table and then remove 
them all. Turns out my previous assumptions were incorrect and just 
side-effects of the way my 1st test worked. The key point seems to be raw table 
size. This are my findings:
 # if a table grows above a certain size (~2.x GiB), inplace compression seems 
only to be able to reclaim the disk space exceeding that boundary
 # if a table grows above X times that size, inplace compression can only 
reclaim the disk space exceeding X times that limit
 # its enough to insert rows until the table reaches that boundary and then 
remove them all, I didn't find additional requirements: 
 ** the DB doesn't have to be fragmented
 ** the table doesn't have to have an index
 ** the row data doesn't have to be of the same (or a certain) size
 ** the table doesn't have to have a BLOB column
 ** it doesn't make any difference if the rows are removed one by one using an 
ID or altogether by a global "delete from"

The attached test is using a table with a single BLOB column and larger chunks 
of data for performance reasons. I also tried with a table only containing a 
few BIGINT columns: The results are comparable and the problem can be 
reproduced. But without a BLOB column the test runs ~2 hours instad of a few 
seconds just inserting rows to reach the critical DB size. Apart from that the 
test should be self-explanatory. 

One more thing that caught my eye: If i modify the test to produce less than 
the critical amount of table data, SYSCS_INPLACE_COMPRESS_TABLE does even 
reclaim *more* disk space than SYSCS_COMPRESS_TABLE. Is that an expected 
behaviour? From reading the docs I wouldn't expect that to be possible.


was (Author: ickzon):
I was able to put together a simplified test reproducing the described problem. 
It's enough to simply add a number of rows into a single table and then remove 
them all. Turns out my previous assumptions were incorrect and just 
side-effects of the way my 1st test worked. The key point seems to be raw table 
size. This are my findings:
 # if a table grows above a certain size (~2.x GiB), inplace compression seems 
only to be able to reclaim the disk space exceeding that boundary
 # if a table grows above X times that size, inplace compression can only 
reclaim the disk space exceeding X times that limit
 # its enough to insert rows until the table reaches that size limit and then 
remove them all, I didn't find additional requirements: 
 ** the DB doesn't have to be fragmented
 ** the table doesn't have to have an index
 ** the row data doesn't have to be of the same (or a certain) size
 ** the table doesn't have to have a BLOB column
 ** it doesn't make any difference if the rows are removed one by one using an 
ID or altogether by a global "delete from"

The attached test is using a table with a single BLOB column and larger chunks 
of data for performance reasons. I also tried with a table only containing a 
few BIGINT columns: The results are comparable and the problem can be 
reproduced. But without a BLOB column the test runs ~2 hours instad of a few 
seconds just inserting rows to reach the critical DB size. Apart from that the 
test should be self-explanatory. 

One more thing that caught my eye: If i modify the test to produce less than 
the critical amount of table data, SYSCS_INPLACE_COMPRESS_TABLE does even 
reclaim *more* disk space than SYSCS_COMPRESS_TABLE. Is that an expected 
behaviour? From reading the docs I wouldn't expect that to be possible.

> SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE doesn't work as expected
> ----------------------------------------------------------------
>
>                 Key: DERBY-7109
>                 URL: https://issues.apache.org/jira/browse/DERBY-7109
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.14.2.0, 10.15.2.0
>            Reporter: Holger Rehn
>            Priority: Major
>         Attachments: DerbyTest.java
>
>
> The SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE procedure seems to be unable to 
> reclaim any disk space under almost all circumstances. I have a number of 
> existing DBs that I tried to reorganize to reclaim disk space using the 
> mentioned procedure. Because that didn't even free the smallest amount of 
> space, I started a number of tests and it looks like inplace compression 
> simply doesn't work, at all.
> Example:
> One of the DBs uses ~37 GB of disk space and the exact same amount after 
> running SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE. After using 
> SYSCS_UTIL.SYSCS_COMPRESS_TABLE to reorganize the DB it only uses 20,4 GB of 
> disk space. This might be correct, if I correctly understand how inplace 
> compression should work - but is not really what I expected. But even after I 
> completely deleted a number of tables and removed more than 90% of the rows 
> from all other tables of the original DB, 
> SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE was unable to reclaim any disk space. 
> In contrast, SYSCS_UTIL.SYSCS_COMPRESS_TABLE now reduced the DB size to ~ 
> 130MB. To carry this to the extreme, I completely removed any data from all 
> tables. As expected, procedure SYSCS_UTIL.SYSCS_COMPRESS_TABLE now results in 
> an empty DB only taking up ~4 MB of disk space, while 
> SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE was still unable to reclaim a 
> relevant amount of disk space (only ~ 4,6GB), leaving a completely empty DB 
> still occupying more than 32 GB of disk space.
> In all calls to SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE I set PURGE_ROWS, 
> DEFRAGMENT_ROWS, and TRUNCATE_END to 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to