Re: [PR] Do not call gporca for simple queries [cloudberry]

via GitHub Sat, 08 Feb 2025 02:31:52 -0800


avamingli commented on PR #900:
URL: https://github.com/apache/cloudberry/pull/900#issuecomment-2644851611


   Hi, thanks for pointing this out, but I have significant concerns about 
this approach and its implications. 
   
   Defining what constitutes a "simple" query is inherently subjective and 
context-dependent. 
   The current implementation attempts to formalize this, but it feels overly 
simplistic and overlooks critical cases like subqueries. The choice of optimizer 
is fundamentally a tuning decision that should remain in the hands of the user 
or application, not the kernel. 
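   
   As a rough illustration of what leaving the choice to the user can look 
like, the `optimizer` setting can be switched per session or per role. This is 
only a sketch: the role name `app_user` is hypothetical and is not part of this 
PR.
   
   ```sql
   -- Session level: use the Postgres planner for this session only.
   SET optimizer = off;

   -- Check which optimizer setting is currently active.
   SHOW optimizer;

   -- Role level: "app_user" is a hypothetical role name used for illustration.
   ALTER ROLE app_user SET optimizer = off;

   -- Return to the configured default for this session.
   RESET optimizer;
   ```
   
   An application that knows its workload is dominated by short statements 
can make this choice itself, without the kernel guessing what "simple" means.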
   
   Additionally, this PR lacks sufficient data-driven justification. 
   Without detailed performance metrics, such as where ORCA is slower for 
"simple" queries and the magnitude of the performance gap, the change is hard 
to justify. Some of that gap is caused by a bug, as shown below: 
   ```sql
   gpadmin=# explain(analyze) insert into test values(1);
                                                 QUERY PLAN
   ------------------------------------------------------------------------------------------------------
    Insert on test  (cost=0.00..0.01 rows=1 width=4) (actual time=0.298..0.300 rows=0 loops=1)
      ->  Result  (cost=0.00..0.00 rows=1 width=8) (actual time=0.017..0.019 rows=1 loops=1)
            ->  Result  (cost=0.00..0.00 rows=1 width=4) (actual time=0.016..0.017 rows=1 loops=1)
                  ->  Result  (cost=0.00..0.00 rows=1 width=1) (actual time=0.007..0.008 rows=1 loops=1)
    Planning Time: 29.739 ms
      (slice0)    Executor memory: 111K bytes (seg1).
    Memory used:  128000kB
    Optimizer: Pivotal Optimizer (GPORCA)
    Execution Time: 2.259 ms
   (9 rows)
   
   Time: 35.444 ms
   ```
   With the PG planner, the same statement gives:
   
   ```sql
   gpadmin=# explain(analyze) insert into test values(1);
                                            QUERY PLAN
   --------------------------------------------------------------------------------------------
    Insert on test  (cost=0.00..0.03 rows=0 width=0) (actual time=0.082..0.083 rows=0 loops=1)
      ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=1)
    Planning Time: 0.473 ms
      (slice0)    Executor memory: 110K bytes (seg1).
    Memory used:  128000kB
    Optimizer: Postgres query optimizer
    Execution Time: 1.308 ms
   (7 rows)
   
   Time: 4.604 ms
   ```
   This is a simple INSERT, but ORCA introduces two additional Result nodes 
that slow it down. 
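   
   For anyone who wants to reproduce the comparison, a minimal sketch follows. 
The table definition is an assumption (the original DDL is not shown in this 
comment); a single integer column is consistent with the `width=4` in the plans 
above.
   
   ```sql
   -- Assumed table definition; the actual DDL is not given in this comment.
   CREATE TABLE test (a int);

   -- Plan and run the INSERT with GPORCA enabled.
   SET optimizer = on;
   EXPLAIN (ANALYZE) INSERT INTO test VALUES (1);

   -- Plan and run the same INSERT with the Postgres planner.
   SET optimizer = off;
   EXPLAIN (ANALYZE) INSERT INTO test VALUES (1);
   ```
   
   Note that `EXPLAIN (ANALYZE)` actually executes the statement, so each run 
inserts a row. Timings will vary by machine and by whether the optimizer was 
already warmed up in the session.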
   
   **Instead of disabling ORCA, we should focus on addressing these 
inefficiencies directly**. 
   This would not only resolve the immediate issue but also improve ORCA’s 
performance for all users, not just those running "simple" queries.
   Rather than sidestepping the problem, we should tackle the root cause 
head-on. 
   By identifying and fixing the specific inefficiencies in ORCA, you could 
deliver a more robust and user-friendly solution that benefits everyone in the 
long term.
   
   
   

