yjhjstz commented on issue #1638:
URL: https://github.com/apache/cloudberry/issues/1638#issuecomment-4128116796

   ```
   Aggregate  (cost=0.00..437.00 rows=1 width=8)
        ->  Limit  (cost=0.00..437.00 rows=46233 width=1)            ← Global Limit only, no pushdown
              ->  Gather Motion 3:1  (rows=46233)
                    ->  Sort  (rows=15411)                           ← 15411 rows/segment
                          ->  Seq Scan on mv_large  (rows=15411)
   ```
   
  Setup: PAX table, no ANALYZE run, gp_enable_relsize_collection=on.
   
  cdb_estimate_rel_size dispatches pg_relation_size() to the QEs to get the actual PAX file size. But PAX stores data in a columnar layout, not heap pages, while the density formula assumes a heap layout:

  ```c
  density = (BLCKSZ - SizeOfPageHeaderData) / tuple_width;  // ~204 tuples/page
  tuples = density * curpages;
  ```

  This gives 15411 rows/segment, a 16x under-estimate (the actual count is 333333/segment).
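  The arithmetic can be reproduced in isolation. A sketch: the constants below are illustrative stand-ins, not values read from a real build. BLCKSZ and the page-header size follow stock PostgreSQL, and the 40-byte effective tuple width (data bytes plus assumed per-tuple overhead) is chosen so the density lands near the ~204 tuples/page figure above.

  ```c
  #include <stdio.h>

  /* Illustrative constants (assumed, not taken from the Cloudberry build). */
  #define BLCKSZ                8192
  #define SizeOfPageHeaderData    24

  /* density = usable bytes per page / effective tuple width */
  static int heap_density(int tuple_width)
  {
      return (BLCKSZ - SizeOfPageHeaderData) / tuple_width;
  }

  /* tuples = density * curpages: the heap model scales linearly with file size */
  static double heap_tuple_estimate(int tuple_width, double curpages)
  {
      return heap_density(tuple_width) * curpages;
  }

  int main(void)
  {
      printf("density = %d tuples/page\n", heap_density(40));

      /* A compact columnar file occupies few pages, so the heap model
       * grossly under-counts: ~75 pages of PAX data come out near 15k
       * rows even if they actually hold 333k. */
      printf("estimate over 75.5 pages = %.0f rows\n",
             heap_tuple_estimate(40, 75.5));
      return 0;
  }
  ```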
   
  For ORCA to push the limit down, the estimated per-segment row count must exceed LIMIT/segments:
  - Required: per-segment rows > 100000/3 = 33334
  - Estimated: 15411 < 33334, so the condition is not satisfied
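  That inequality is easy to check numerically. A sketch with the numbers from above; `limit_pushdown_profitable` is an illustrative name, not ORCA's actual predicate:

  ```c
  #include <stdbool.h>
  #include <stdio.h>

  /* Splitting the limit only pays off when each segment is expected to
   * produce more rows than its share of the global LIMIT. */
  static bool limit_pushdown_profitable(double per_segment_rows,
                                        double limit, int num_segments)
  {
      return per_segment_rows > limit / num_segments;
  }

  int main(void)
  {
      /* With the heap-model estimate: 15411 < 100000/3, no pushdown. */
      printf("estimated: %s\n",
             limit_pushdown_profitable(15411, 100000, 3) ? "push" : "no push");

      /* With the true row count: 333333 > 33334, pushdown would win. */
      printf("actual:    %s\n",
             limit_pushdown_profitable(333333, 100000, 3) ? "push" : "no push");
      return 0;
  }
  ```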
   
  At 15411 estimated rows against a cap of 33334, a local limit filters nothing: the split plan adds a LocalLimit operator with cost > 0 but saves nothing on the Gather Motion. Given its (wrong) estimate, ORCA correctly picks the cheaper non-split plan.
   
  The root cause: cdb_estimate_rel_size uses a heap-specific density model that doesn't apply to PAX columnar storage. Even with real file sizes, the row estimate is fundamentally inaccurate for non-heap table AMs.
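  One possible direction, sketched here purely as an assumption: let the AM translate file size into rows using its own on-disk bytes-per-row instead of the heap page model. (PostgreSQL's table AM interface already exposes a `relation_estimate_size` callback for size estimation; the helper and constants below are hypothetical.)

  ```c
  #include <stdint.h>
  #include <stdio.h>

  /* Hypothetical AM-aware estimate: divide the actual file size by the
   * AM's average on-disk bytes per row, which a columnar format like PAX
   * could report from its own metadata rather than assuming heap pages. */
  static double am_tuple_estimate(int64_t file_bytes, double avg_row_bytes)
  {
      return (double) file_bytes / avg_row_bytes;
  }

  int main(void)
  {
      /* Assumed numbers: ~2.6 MB of compressed columnar data at ~8 bytes
       * per row recovers an estimate near the true 333333 rows/segment,
       * where the heap density model produced only ~15k. */
      printf("AM-aware estimate = %.0f rows\n", am_tuple_estimate(2666664, 8.0));
      return 0;
  }
  ```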


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

