avamingli commented on PR #900:
URL: https://github.com/apache/cloudberry/pull/900#issuecomment-2644851611
Hi, thanks for pointing this out, but I have significant concerns about
this approach and its implications.
Defining what constitutes a "simple" query is inherently subjective and
context-dependent.
The current implementation attempts to formalize this, but it feels overly
simplistic and overlooks critical cases like subqueries. The choice of optimizer
is fundamentally a tuning decision that should remain in the hands of the user
or application, not the kernel.
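For illustration, a user or application that finds ORCA slow for a particular statement can already switch planners per session. This is a minimal sketch, assuming the `optimizer` GUC behaves as it does in Greenplum-derived builds and that a table named `test` accepting the value `1` exists:

```sql
-- Use the Postgres planner for this session only
-- (optimizer = on selects GPORCA, off selects the Postgres planner).
SET optimizer = off;

-- Note: EXPLAIN (ANALYZE) actually executes the INSERT.
EXPLAIN (ANALYZE) INSERT INTO test VALUES (1);

-- Return to the configured default.
RESET optimizer;
```

Because the setting is scoped to the session, the decision stays with whoever knows the workload.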
Additionally, this PR lacks sufficient data-driven justification.
Without detailed performance metrics, such as where ORCA is slower for
"simple" queries and how large the performance gap is, the change is hard to
evaluate. Part of the gap is due to the bug below:
```sql
gpadmin=# explain(analyze) insert into test values(1);
QUERY PLAN
------------------------------------------------------------------------------------------------------
Insert on test (cost=0.00..0.01 rows=1 width=4) (actual time=0.298..0.300 rows=0 loops=1)
  -> Result (cost=0.00..0.00 rows=1 width=8) (actual time=0.017..0.019 rows=1 loops=1)
        -> Result (cost=0.00..0.00 rows=1 width=4) (actual time=0.016..0.017 rows=1 loops=1)
              -> Result (cost=0.00..0.00 rows=1 width=1) (actual time=0.007..0.008 rows=1 loops=1)
Planning Time: 29.739 ms
(slice0) Executor memory: 111K bytes (seg1).
Memory used: 128000kB
Optimizer: Pivotal Optimizer (GPORCA)
Execution Time: 2.259 ms
(9 rows)
Time: 35.444 ms
```
With the PG planner, the same statement gives:
```sql
gpadmin=# explain(analyze) insert into test values(1);
QUERY PLAN
--------------------------------------------------------------------------------------------
Insert on test (cost=0.00..0.03 rows=0 width=0) (actual time=0.082..0.083 rows=0 loops=1)
  -> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=1)
Planning Time: 0.473 ms
(slice0) Executor memory: 110K bytes (seg1).
Memory used: 128000kB
Optimizer: Postgres query optimizer
Execution Time: 1.308 ms
(7 rows)
Time: 4.604 ms
```
This is a simple INSERT, but ORCA introduces two additional Result nodes that
slow it down.
**Instead of disabling ORCA, we should focus on addressing these
inefficiencies directly**.
This would not only resolve the immediate issue but also improve ORCA’s
performance for all users, not just those running "simple" queries.
Rather than sidestepping the problem, we should tackle the root cause
head-on.
By identifying and fixing the specific inefficiencies in ORCA, you could
deliver a more robust and user-friendly solution that benefits everyone in the
long term.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]