This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit 37f780f00a5a5fdb754ac952896f6c3368d4256d
Author: Alexey Serbin <[email protected]>
AuthorDate: Thu Jan 30 12:36:01 2020 -0800

    [docs] note about space reclamation for deleted rows
    
    Added a note about not reclaiming the space after deleting rows from
    a table, urging for a proper schema design for larger fact tables.
    
    Change-Id: Ib49e86ed80af0325e3ceaceb9964749534755be4
    Reviewed-on: http://gerrit.cloudera.org:8080/15138
    Reviewed-by: Adar Dembo <[email protected]>
    Tested-by: Kudu Jenkins
---
 docs/schema_design.adoc | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/docs/schema_design.adoc b/docs/schema_design.adoc
index 3ecceed..9b05991 100644
--- a/docs/schema_design.adoc
+++ b/docs/schema_design.adoc
@@ -570,3 +570,14 @@ Non-alterable Column Types:: Kudu does not allow the type of a column to be
 altered.
 
 Partition Splitting:: Partitions cannot be split or merged after table creation.
+
+Deleted row disk space is not reclaimed:: The disk space occupied by a deleted
+row is only reclaimable via compaction, and only when the deletion's age
+exceeds the "tablet history maximum age" (controlled by the
+`--tablet_history_max_age_sec` flag). Furthermore, Kudu currently only schedules
+compactions in order to improve read/write performance; a tablet will never be
+compacted purely to reclaim disk space. As such, range partitioning should be
+used when it is expected that large swaths of rows will be discarded. With range
+partitioning, individual partitions may be dropped to discard data and reclaim
+disk space.  See link:https://issues.apache.org/jira/browse/KUDU-1625[KUDU-1625]
+for details.
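
As an illustration of the advice in the added note, here is a minimal sketch of how a retention job might decide which range partitions are safe to drop. It is a pure-Python model, not Kudu client code: the `Partition` type, the `droppable_partitions` helper, and the monthly bounds are all hypothetical, and it assumes each range partition covers a half-open interval [lower, upper) on a date column.

```python
from datetime import date
from typing import List, Tuple

# Hypothetical model: a range partition covers [lower, upper) on a date column.
Partition = Tuple[date, date]


def droppable_partitions(partitions: List[Partition], cutoff: date) -> List[Partition]:
    """Return the partitions in which every row is older than `cutoff`."""
    # The upper bound is exclusive, so upper <= cutoff means all rows are < cutoff.
    return [(lower, upper) for (lower, upper) in partitions if upper <= cutoff]


# Example: three monthly partitions and a retention cutoff of 2020-03-01.
monthly = [
    (date(2020, 1, 1), date(2020, 2, 1)),
    (date(2020, 2, 1), date(2020, 3, 1)),
    (date(2020, 3, 1), date(2020, 4, 1)),
]
expired = droppable_partitions(monthly, cutoff=date(2020, 3, 1))
```

Only partitions that lie entirely before the cutoff are selected. A partition that straddles the cutoff would still need row-level deletes, and per the note above, the space those deletes occupy would not be reclaimed promptly.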
