This is an automated email from the ASF dual-hosted git repository.

maxyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/cloudberry.git


The following commit(s) were added to refs/heads/main by this push:
     new ebc52ca59bf PAX: Adapter the capacity of column by GUC
ebc52ca59bf is described below

commit ebc52ca59bfa93b363d490911cdd11920390026a
Author: Hao Wu <gfphoeni...@gmail.com>
AuthorDate: Fri Aug 29 12:03:17 2025 +0000

    PAX: Adapter the capacity of column by GUC
    
    The initial capacity of a PAX column is hard-coded to 2048. This can
    consume a lot of memory across all PAX columns when the target of an
    insert is a large partitioned table. As an extreme example, take a
    partitioned table with 2000+ leaf partitions of 300 columns each: the
    writer backend writes 600000+ columns at the same time, so the initial
    capacity of each column greatly affects memory usage.
    
    When the GUC pax_max_tuples_per_group is low, a large initial capacity
    causes high memory usage, most of which is wasted.
---
 contrib/pax_storage/src/cpp/storage/columns/pax_column.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/pax_storage/src/cpp/storage/columns/pax_column.h b/contrib/pax_storage/src/cpp/storage/columns/pax_column.h
index b713dbc73fb..45c78bc0de0 100644
--- a/contrib/pax_storage/src/cpp/storage/columns/pax_column.h
+++ b/contrib/pax_storage/src/cpp/storage/columns/pax_column.h
@@ -46,7 +46,7 @@
 
 namespace pax {
 
-#define DEFAULT_CAPACITY 2048
+#define DEFAULT_CAPACITY MIN(2048, MAX(16, MAXALIGN(pax::pax_max_tuples_per_group)))
 
 // Used to mapping pg_type
 enum PaxColumnTypeInMem {
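
The clamp introduced by the new DEFAULT_CAPACITY can be sketched as a
standalone function. This is only an illustration, not code from the
commit: InitialCapacity, MaxAlign, TotalSlots and kAlign are hypothetical
names, MaxAlign assumes the 8-byte maximum alignment typical of 64-bit
PostgreSQL builds, and capacity is counted in slots because the commit
does not state the width of an element.

```cpp
#include <algorithm>
#include <cstdint>

// Assumption: maximum alignment is 8 bytes (typical for 64-bit builds).
constexpr std::uint64_t kAlign = 8;

// Round n up to the next multiple of kAlign, as MAXALIGN does.
constexpr std::uint64_t MaxAlign(std::uint64_t n) {
  return (n + kAlign - 1) & ~(kAlign - 1);
}

// Mirrors MIN(2048, MAX(16, MAXALIGN(pax_max_tuples_per_group))):
// the aligned GUC value, clamped to the range [16, 2048].
constexpr std::uint64_t InitialCapacity(std::uint64_t max_tuples_per_group) {
  return std::min<std::uint64_t>(
      2048, std::max<std::uint64_t>(16, MaxAlign(max_tuples_per_group)));
}

// Slots reserved up front when every column of every leaf partition
// is open for writing at the same time.
constexpr std::uint64_t TotalSlots(std::uint64_t partitions,
                                   std::uint64_t columns_per_table,
                                   std::uint64_t capacity) {
  return partitions * columns_per_table * capacity;
}

// Small GUC values are raised to the floor of 16.
static_assert(InitialCapacity(10) == 16, "floor applies");
// Mid-range values are only rounded up to the alignment boundary.
static_assert(InitialCapacity(100) == 104, "aligned, not clamped");
// Large GUC values keep the old capacity of 2048.
static_assert(InitialCapacity(131072) == 2048, "ceiling applies");
```

For the commit message's example of 2000 leaf partitions with 300 columns
each, TotalSlots(2000, 300, 2048) is 1228800000 slots under the old fixed
capacity, while TotalSlots(2000, 300, InitialCapacity(10)) is 9600000
slots once a low GUC value brings the capacity down to 16.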


