assam258-5892 opened a new issue, #1222:
URL: https://github.com/apache/cloudberry/issues/1222

   ### Apache Cloudberry version
   
   # GitHub Issue Report
   
   ## 🏷️ Issue Title
   ```
   Assert failure in GIN index due to MaxHeapTuplesPerPageBits modification
   ```
   
   ## 📋 Issue Template
   
   ### **Environment**
   - **Repository:** apache/cloudberry
   - **Tag:** 2.0.0-incubating-rc2
   - **Component:** GIN Index
   - **Severity:** High
   - **Priority:** P0
   
   ### **Labels**
   - `bug`
   - `gin-index`
   - `append-only-tables`
   - `assert-failure`
   - `high-priority`
   
   ### **Summary**
   An assertion failure occurs in GIN index operations on append-only tables
   because `MaxHeapTuplesPerPageBits` was widened while the
   `OffsetNumberIsValid` macro still applies the heap-based limit.
   
   ### **Problem Description**
   When porting from PostgreSQL to Cloudberry, `MaxHeapTuplesPerPageBits` was
   modified from 11 to 16 bits to support append-only table optimization.
   However, the validation logic in `OffsetNumberIsValid` was not updated
   accordingly, causing Assert failures when processing large OffsetNumbers
   that are valid within the 16-bit range but exceed the heap-based limit.
   
   ### **Root Cause**
   ```c
   // MaxHeapTuplesPerPageBits was modified to support 16-bit range
   #define MaxHeapTuplesPerPageBits    16  // 65536 distinct values, maximum offset 65535
   
   // But OffsetNumberIsValid still uses heap-based MaxOffsetNumber
   #define OffsetNumberIsValid(offsetNumber) \
       ((offsetNumber != InvalidOffsetNumber) && \
        (offsetNumber <= MaxOffsetNumber))  // Still ~291 for heap tables
   ```
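
The mismatch can be reproduced outside the server with a small stand-alone model. This is only a sketch, not Cloudberry code: `offset_is_valid_with_limit`, `HEAP_BASED_LIMIT` and `SIXTEEN_BIT_LIMIT` are hypothetical names, and 291 is simply the heap-based figure quoted in this report rather than a value read from the source tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the server's types; the names mirror the report. */
typedef uint16_t OffsetNumber;
#define InvalidOffsetNumber ((OffsetNumber) 0)

/* Hypothetical limits: the heap-based figure quoted above and the 16-bit maximum. */
#define HEAP_BASED_LIMIT   ((OffsetNumber) 291)
#define SIXTEEN_BIT_LIMIT  ((OffsetNumber) 65535)

/* Same shape as OffsetNumberIsValid, but with the upper bound passed in. */
static bool
offset_is_valid_with_limit(OffsetNumber off, OffsetNumber max_offset)
{
    return off != InvalidOffsetNumber && off <= max_offset;
}

/* Usage: the report's example offset passes one limit and fails the other. */
static void
demo_report_example(void)
{
    assert(offset_is_valid_with_limit(30000, SIXTEEN_BIT_LIMIT));
    assert(!offset_is_valid_with_limit(30000, HEAP_BASED_LIMIT));
}
```

The same offset is accepted or rejected depending only on which upper bound the check uses, which is the inconsistency described here.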
   
   ### **Assert Location**
   - **File:** `src/backend/access/gin/ginpostinglist.c`
   - **Line:** 338
   - **Code:** `Assert(OffsetNumberIsValid(ItemPointerGetOffsetNumber(&segment->first)));`
   
   ### **Steps to Reproduce**
   1. Create an append-only table:
      ```sql
      CREATE TABLE test_ao (id int, data text) WITH (appendonly=true);
      ```
   
   2. Create a GIN index:
      ```sql
      CREATE INDEX idx_test_ao_gin ON test_ao
          USING gin(to_tsvector('english', data));
      ```
   
   3. Insert a large amount of data to generate large OffsetNumbers:
      ```sql
      INSERT INTO test_ao
      SELECT i, 'test data ' || i FROM generate_series(1, 100000) AS i;
      ```
   
   4. Query using the GIN index:
      ```sql
      SELECT * FROM test_ao
      WHERE to_tsvector('english', data) @@ to_tsquery('test');
      ```
   
   ### **Expected Behavior**
   - GIN index should work properly with append-only tables
   - Large OffsetNumbers (up to 65535) should be accepted as valid
   - No Assert failures should occur
   
   ### **Actual Behavior**
   - Assert failure occurs in `ginpostinglist.c` line 338
   - Error: `OffsetNumberIsValid` validation fails for valid OffsetNumbers > 291
   - Example: OffsetNumber = 30000 (valid in 16-bit range) fails validation
     because 30000 > 291
   
   ### **Failure Scenario**
   ```c
   // Append-only table generates large OffsetNumber
   OffsetNumber large_offset = 30000;  // Valid within 16-bit range
   
   // Validation fails due to heap-based limit
   Assert(OffsetNumberIsValid(large_offset));  // 30000 > 291 → Assert failure!
   ```
   
   ### **Impact**
   - **High severity:** Causes application crashes
   - **Blocks functionality:** Prevents using GIN indexes on append-only tables
   - **Affects:** All append-only tables with GIN indexes containing large
     OffsetNumbers
   - **Data integrity:** No data corruption, but functionality is blocked
   
   ### **Proposed Solution**
   
   #### **Option 1: Immediate Fix (Recommended)**
   Modify line 338 in `ginpostinglist.c` to use 16-bit range validation:
   
   ```c
   // Original code:
   Assert(OffsetNumberIsValid(ItemPointerGetOffsetNumber(&segment->first)));
   
   // Proposed fix:
   {
       OffsetNumber offset = ItemPointerGetOffsetNumber(&segment->first);
       Assert(offset != InvalidOffsetNumber);
       Assert(offset <= ((1 << MaxHeapTuplesPerPageBits) - 1));
   }
   ```
   
   #### **Option 2: Long-term Improvement**
   Add new macros in `src/include/access/gin_private.h`:
   
   ```c
   #define MaxGinOffsetNumber ((1 << MaxHeapTuplesPerPageBits) - 1)
   #define GinOffsetNumberIsValid(offsetNumber) \
       ((offsetNumber != InvalidOffsetNumber) && \
        (offsetNumber <= MaxGinOffsetNumber))
   
   // Then modify ginpostinglist.c line 338:
   Assert(GinOffsetNumberIsValid(ItemPointerGetOffsetNumber(&segment->first)));
   ```
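
One property of the proposed check is worth noting. Assuming `OffsetNumber` is a 16-bit unsigned type (as it is in upstream PostgreSQL), no value can exceed `(1 << 16) - 1`, so with 16 bits the upper-bound test always holds and only the comparison against `InvalidOffsetNumber` can fail. The stand-alone sketch below checks this exhaustively; `count_valid_offsets` is a hypothetical helper written for illustration, not part of Cloudberry.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the server's definitions (assumed to match upstream PostgreSQL). */
typedef uint16_t OffsetNumber;
#define InvalidOffsetNumber ((OffsetNumber) 0)

/*
 * Count how many of the 65536 possible OffsetNumber values pass a check of
 * the form (off != InvalidOffsetNumber && off <= (1 << bits) - 1).
 */
static uint32_t
count_valid_offsets(unsigned bits)
{
    uint32_t max_offset;
    uint32_t count = 0;

    assert(bits >= 1 && bits <= 16);
    max_offset = (UINT32_C(1) << bits) - 1;

    for (uint32_t v = 0; v <= UINT16_MAX; v++)
    {
        OffsetNumber off = (OffsetNumber) v;

        if (off != InvalidOffsetNumber && off <= max_offset)
            count++;
    }
    return count;
}
```

With `bits = 16` every value except zero is accepted, whereas `bits = 11` accepts only 2047 values. The macro therefore documents intent more than it restricts anything while the width stays at 16.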
   
   ### **Advantages of Proposed Solution**
   - ✅ **Minimally invasive:** Changes only 1 line for immediate fix
   - ✅ **Safe:** Maintains existing data compatibility
   - ✅ **Preserves optimization:** Keeps append-only table 16-bit optimization
   - ✅ **Low risk:** Minimal code changes reduce introduction of new bugs
   - ✅ **Immediate:** Can be applied as a hotfix
   
   ### **Alternative Solutions Considered**
   1. **Revert MaxHeapTuplesPerPageBits to 11:** ❌ Causes data corruption and
      loses append-only optimization
   2. **Complete redesign:** ❌ High risk and time-consuming
   3. **Global OffsetNumberIsValid modification:** ❌ May affect other
      components unexpectedly
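
To illustrate why narrowing the width (alternative 1) is unsafe, the sketch below packs a block number and an offset into one integer, reserving a configurable number of low bits for the offset. It assumes the posting-list code combines the two fields in this way with `MaxHeapTuplesPerPageBits` low bits, as upstream PostgreSQL's `ginpostinglist.c` does; Cloudberry's exact layout may differ, and `pack_item_pointer`, `unpack_block` and `unpack_offset` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Pack (block, offset) into one integer, reserving `bits` low bits for the offset. */
static uint64_t
pack_item_pointer(uint32_t block, uint16_t offset, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    return ((uint64_t) block << bits) | offset;
}

/* Recover the block number: everything above the low `bits` bits. */
static uint32_t
unpack_block(uint64_t packed, unsigned bits)
{
    return (uint32_t) (packed >> bits);
}

/* Recover the offset: only the low `bits` bits. */
static uint16_t
unpack_offset(uint64_t packed, unsigned bits)
{
    return (uint16_t) (packed & ((UINT64_C(1) << bits) - 1));
}
```

With 16 bits, block 5 and offset 30000 round-trip unchanged. With 11 bits, the high bits of the offset spill into the block field, so neither value is recovered, which is consistent with the corruption concern raised for alternative 1.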
   
   ### **Testing Requirements**
   1. **Functional testing:** Verify GIN index works with append-only tables
   2. **Regression testing:** Ensure heap tables still work correctly
   3. **Performance testing:** Verify no performance degradation
   4. **Edge case testing:** Test with maximum OffsetNumber values
   
   ### **Additional Context**
   This issue is specific to Cloudberry's append-only table optimization.
   PostgreSQL users are not affected as they use the original 11-bit
   implementation. The modification was made to support large-scale OLAP
   workloads and columnar storage optimization in Cloudberry.
   
   ### **Files to be Modified**
   - `src/backend/access/gin/ginpostinglist.c` (line 338) - Primary fix
   - `src/include/access/gin_private.h` (optional) - Long-term improvement
   
   ### **Risk Assessment**
   - **Risk Level:** Low (minimal code change)
   - **Rollback:** Easy (single line change)
   - **Testing:** Straightforward test cases
   - **Compatibility:** Maintains backward compatibility
   
   
   ### What happened
   
   The server crashes at the assertion below.
   
   ### **Assert Location**
   - **File:** `src/backend/access/gin/ginpostinglist.c`
   - **Line:** 338
   - **Code:** `Assert(OffsetNumberIsValid(ItemPointerGetOffsetNumber(&segment->first)));`
   
   ### What you think should happen instead
   
   ### **Problem Description**
   When porting from PostgreSQL to Cloudberry, `MaxHeapTuplesPerPageBits` was 
modified from 11 to 16 bits to support append-only table optimization. However, 
the validation logic in `OffsetNumberIsValid` was not updated accordingly, 
causing Assert failures when processing large OffsetNumbers that are valid 
within the 16-bit range but exceed the heap-based limit.
   
   ### How to reproduce
   
   A reproduction case is difficult to provide because the data sits inside
   the corporate security network.
   
   ### Operating System
   
   All
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes, I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/cloudberry/blob/main/CODE_OF_CONDUCT.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

