(cloudberry) 04/05: Toasting for AO tables should still use custom toast_tuple_target

reshke Wed, 29 Jan 2025 08:17:18 -0800

This is an automated email from the ASF dual-hosted git repository.

reshke pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/cloudberry.git


commit 3199861dfe337d49cc432e936a7752440da1d99d
Author: Huansong Fu <[email protected]>
AuthorDate: Wed Jun 22 14:10:35 2022 -0700

    Toasting for AO tables should still use custom toast_tuple_target
    
    In commit a0684da5f14f we made a change to use the fixed TOAST_TUPLE_TARGET
    instead of custom toast_tuple_target that's calculated based on the the
    blocksize of the AO table for toasting AO table tuples.
    
    This caused an issue that some tuples that are well beyond the
    toast_tuple_target are not being toasted (because they're smaller than
    TOAST_TUPLE_TARGET). This is ok when the tuples can still fit into the
    AO table's blocksize. But if not, an error would occur. E.g.:
    
      postgres=# create table t (a int, b int[]) WITH (appendonly=true, 
blocksize=8192);
      CREATE TABLE
      postgres=# insert into t select 1, array_agg(x) from generate_series(1, 
2030) x;
      ERROR:  item too long (check #1): length 8160, maxBufferLen 8168
      CONTEXT:  Append-Only table 't'
    
    Therefore, we should revert those changes, including:
    
    1. Still use the toast_tuple_target for toasting 'x' and 'e' columns for AO 
tables.
    There's also a GPDB_12_MERGE_FIXME in the original comment suggesting to use
    RelationGetToastTupleTarget for AO table (in order to unify the logic to 
calculate
    maxDataLen for heap and AO tuples, I suppose). But that's not possible 
currently
    because the calculated toast_tuple_target is stored in AppendOnlyInsertDesc 
and
    we cannot get it from the Relation struct. And it seems to me that we won't 
have
    a way to do that easily. Therefore, keep this FIXME comment being removed.
    
    2. Still use the toast_tuple_target for toasting 'm' columns for AO tables. 
Also restore
    the FIXME comment suggesting that we should use a bigger target size for 
'm' columns.
    This should be something that worth investigating more into.
    
    Commit a0684da5f14f also includes a change to use custom toast_tuple_target 
for heap tables,
    which should be a valid change. I think that fixed an oversight when 
merging the upstream
    commit c2513365a0a85e77d3c21adb92fe12cfbe0d1897.
---
 src/backend/access/heap/heaptoast.c | 16 +++++++++++-----
 src/test/regress/expected/toast.out | 13 +++++++++++++
 src/test/regress/sql/toast.sql      | 12 ++++++++++++
 3 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/heap/heaptoast.c 
b/src/backend/access/heap/heaptoast.c
index efe1087011..5fdf704106 100644
--- a/src/backend/access/heap/heaptoast.c
+++ b/src/backend/access/heap/heaptoast.c
@@ -210,10 +210,8 @@ heap_toast_insert_or_update_generic(Relation rel, void 
*newtup, void *oldtup,
        }
        else
        {
-               /* Since reloptions for AO table is not permitted, so using 
TOAST_TUPLE_TARGET */
-               hoff = sizeof(MemTupleData);
-               hoff = MAXALIGN(hoff);
-               maxDataLen = TOAST_TUPLE_TARGET - hoff;
+               maxDataLen = toast_tuple_target;
+               hoff = -1; /* keep compiler quiet about using 'hoff' 
uninitialized */
        }
 
        /*
@@ -295,7 +293,15 @@ heap_toast_insert_or_update_generic(Relation rel, void 
*newtup, void *oldtup,
         * increase the target tuple size, so that MAIN attributes aren't stored
         * externally unless really necessary.
         */
-       maxDataLen = TOAST_TUPLE_TARGET_MAIN - hoff;
+       /*
+        * FIXME: Should we do something like this with memtuples on
+        * AO tables too? Currently we do not increase the target tuple size 
for AO
+        * table, so there are occasions when columns of type 'm' will be stored
+        * out-of-line but they could otherwise be accommodated in-block
+        * c.f. upstream Postgres commit 
ca7c8168de76459380577eda56a3ed09b4f6195c
+        */
+       if (!ismemtuple)
+               maxDataLen = TOAST_TUPLE_TARGET_MAIN - hoff;
 
        while (heap_compute_data_size(tupleDesc,
                                                                  toast_values, 
toast_isnull) > maxDataLen &&
diff --git a/src/test/regress/expected/toast.out 
b/src/test/regress/expected/toast.out
index 9d97ab868c..6650facc0d 100644
--- a/src/test/regress/expected/toast.out
+++ b/src/test/regress/expected/toast.out
@@ -39,6 +39,19 @@ SELECT char_length(a), char_length(b), c FROM toastable_ao 
ORDER BY c;
     10000000 |    10000032 | 3
 (3 rows)
 
+-- Check that small tuples can be correctly toasted as long as it's beyond the 
toast
+-- target size (about 1/4 of the table's blocksize)
+CREATE TABLE toastable_ao2(a int, b int[]) WITH (appendonly=true, 
blocksize=8192);
+INSERT INTO toastable_ao2 SELECT 1, array_agg(x) from generate_series(1, 1000) 
x;
+INSERT INTO toastable_ao2 SELECT 1, array_agg(x) from generate_series(1, 2030) 
x;
+SELECT gp_segment_id, get_rel_toast_count('toastable_ao2') FROM 
gp_dist_random('gp_id') order by gp_segment_id;
+ gp_segment_id | get_rel_toast_count 
+---------------+---------------------
+             0 |                   0
+             1 |                   2
+             2 |                   0
+(3 rows)
+
 -- UPDATE 
 -- (heap rel only) and toast the large tuple
 UPDATE toastable_heap SET a=repeat('A',100000) WHERE c=1;
diff --git a/src/test/regress/sql/toast.sql b/src/test/regress/sql/toast.sql
index a35cb77e4b..bcf070acce 100644
--- a/src/test/regress/sql/toast.sql
+++ b/src/test/regress/sql/toast.sql
@@ -3,6 +3,10 @@ CREATE TABLE toastable_heap(a text, b varchar, c int) 
distributed by (b);
 CREATE TABLE toastable_ao(a text, b varchar, c int) with(appendonly=true, 
compresslevel=1) distributed by (b);
 ALTER TABLE toastable_ao ALTER COLUMN a SET STORAGE EXTERNAL;
 
+-- start_ignore
+create language plpython3u;
+-- end_ignore
+
 -- Helper function to generate a random string with given length. (Due to the
 -- implementation, the length is rounded down to nearest multiple of 32.
 -- That's good enough for our purposes.)
@@ -33,6 +37,14 @@ INSERT INTO toastable_ao VALUES(randomtext(10000000), 
randomtext(10000032), 3);
 SELECT char_length(a), char_length(b), c FROM toastable_heap ORDER BY c;
 SELECT char_length(a), char_length(b), c FROM toastable_ao ORDER BY c;
 
+-- Check that small tuples can be correctly toasted as long as it's beyond the 
toast
+-- target size (about 1/4 of the table's blocksize)
+CREATE TABLE toastable_ao2(a int, b int[]) WITH (appendonly=true, 
blocksize=8192);
+INSERT INTO toastable_ao2 SELECT 1, array_agg(x) from generate_series(1, 1000) 
x;
+INSERT INTO toastable_ao2 SELECT 1, array_agg(x) from generate_series(1, 2030) 
x;
+SELECT gp_segment_id, get_rel_toast_count('toastable_ao2') FROM 
gp_dist_random('gp_id') order by gp_segment_id;
+
+
 -- UPDATE 
 -- (heap rel only) and toast the large tuple
 UPDATE toastable_heap SET a=repeat('A',100000) WHERE c=1;


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

(cloudberry) 04/05: Toasting for AO tables should still use custom toast_tuple_target

Reply via email to