Thanks for your valuable feedback. I agree with your suggestions.

In practice, when there is only one pending table in the system, it is
indeed reasonable to break the quota limit for that table to improve
overall optimizing efficiency. However, since we cannot predict whether
additional tables will enter the Pending state in the future, it is better
to make it configurable whether the quota limit may be exceeded.

The procedure for an optimizer thread to poll a task is as follows: each
thread, following the scheduling order, selects the table with the highest
scheduling priority from the tableQueue and polls optimizing tasks from the
taskQueue of that table’s optimizing process. If the selected table has
reached its optimizing resource limit, the thread proceeds to the next
pending table.
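
As a rough illustration of this polling order, here is a minimal sketch in
Python. The names (Table, quota, running, poll_task) are hypothetical and
chosen only for this example; they are not the actual classes or fields in
the code base, and table_queue is assumed to be already sorted by
scheduling priority.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Table:
    # Hypothetical stand-in for a table and its optimizing process.
    name: str
    quota: int                  # max optimizer threads this table may occupy
    running: int = 0            # threads currently executing its tasks
    task_queue: deque = field(default_factory=deque)


def poll_task(table_queue: List[Table]) -> Optional[str]:
    """Return the next task for one optimizer thread, or None."""
    for table in table_queue:   # assumed sorted by scheduling priority
        if not table.task_queue:
            continue            # nothing pending for this table
        if table.running >= table.quota:
            continue            # quota reached: proceed to the next table
        table.running += 1
        return table.task_queue.popleft()
    return None                 # no table can hand out a task right now
```

With a single table whose quota is 1 and two queued tasks, the first call
returns a task and the second returns None.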

Consider a scenario where the number of pending optimizing tasks of a table
exceeds its quota, but there are still idle optimizer resources available.
In this case, some optimizer threads may traverse all pending tables and
pull a null task, even though there are still pending optimizing tasks.
This occurs because the tables to which these tasks belong have already
reached their resource quota limits.

To address this issue, when quota overcommitment is enabled, the system can
allow idle optimizer threads to break the per-table quota limit and execute
the optimizing task. This approach ensures that, in scenarios where there
is only a single pending table or where certain tables have low quota
settings, all optimizer resources can be fully utilized.
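
A minimal sketch of how such a fallback could look is shown below, in
Python. It is only one possible design: the two-pass structure, the
allow_overcommit flag, and the Table fields are assumptions made for this
example, not the final implementation or existing names in the code base.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Table:
    # Hypothetical stand-in for a table and its optimizing process.
    name: str
    quota: int                  # max optimizer threads this table may occupy
    running: int = 0            # threads currently executing its tasks
    task_queue: deque = field(default_factory=deque)


def poll_task(table_queue: List[Table], allow_overcommit: bool) -> Optional[str]:
    """Return the next task for one optimizer thread, or None."""
    # First pass: respect every table's quota, in priority order.
    for table in table_queue:
        if table.task_queue and table.running < table.quota:
            table.running += 1
            return table.task_queue.popleft()
    # Second pass (assumed design): the thread would otherwise stay idle,
    # so it may exceed a table's quota if overcommitment is enabled.
    if allow_overcommit:
        for table in table_queue:
            if table.task_queue:
                table.running += 1
                return table.task_queue.popleft()
    return None
```

Because the second pass runs only after the first pass finds nothing, a
table within its quota is always served before any table is allowed to
exceed its own.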

Thank you for pointing this out. I recognize the importance of your
suggestion and will make the appropriate changes.

Best,
Xixi Chen

On Wed, Jul 23, 2025 at 5:08 PM Qishang Zhong <zhongqish...@gmail.com> wrote:

> Hi.
>
> Thanks for starting this thread.
>
> I agree with Jinsong's point of view, a percentage might be more
> appropriate.
>
> In one case, I have only one pending table, and I have generated tasks that
> exceed the quota. Can I break the quota limit?
>
> Best,
> Qishang Zhong
>
> On Tue, Jul 22, 2025 at 7:14 PM 陈xx <cxxiii8...@gmail.com> wrote:
>
> > Thanks for your suggestion.
> >
> > Indeed, if we set the default quota for tables to a fixed value, it is
> > still possible to lead to a situation where a single table monopolizes
> > all optimizer resources when there are very few optimizer resources.
> > Conversely, when optimizer resources are abundant, it may result in
> > consistently low optimizing efficiency for the table having multiple
> > self-optimizing tasks, and cause inefficient utilization of optimizer
> > resources. Therefore, setting the quota as a percentage of all available
> > optimizer resources is a more appropriate approach. I will proceed with
> > the corresponding modifications accordingly.
> >
> > Best,
> > XixiChen
> >
> > On Tue, Jul 22, 2025 at 6:01 PM Jinsong Zhou <jinsongz...@apache.org> wrote:
> >
> > > Hi,
> > >
> > > Thanks for bringing up this improvement.
> > >
> > > This improvement is indeed valuable. In my production practice, we once
> > > encountered a situation where a single table suddenly consumed all
> > > resources, causing all other tables to enter a pending status. We need
> > > the capability to limit the maximum resources a single table can use.
> > >
> > > However, I'd like to discuss how to set the default quota for tables. I
> > > believe in most cases, users won't configure individual quotas for each
> > > table, making default quotas particularly important. Rather than using
> > > a fixed value, a percentage might be more appropriate - for example, 50%,
> > > indicating a table can only consume half of the entire group's
> > > resources.
> > >
> > > Best,
> > > Jinsong
> > >
> > >
> > >
> > > On Tue, Jul 22, 2025 at 5:52 PM 陈xx <cxxiii8...@gmail.com> wrote:
> > >
> > > > Hi devs:
> > > >
> > > > We would like to start a discussion about AIP-1: Optimizing
> > > > Allocation and Schedule Priority of Optimizer resources for
> > > > Tables[1].
> > > >
> > > > An optimizer group comprises a collection of optimizers, where each
> > > > optimizer instance typically contains multiple threads, with each
> > > > optimizer thread responsible for executing a single optimizing task.
> > > > When multiple self-optimizing tasks are pending and optimizer
> > > > resources are limited, tasks originating from the same table may
> > > > monopolize all available resources in the absence of proper
> > > > constraints.
> > > > So, we propose to optimize allocation and schedule priority of
> > > > optimizer resources for tables.
> > > > Looking forward to hearing from you.
> > > >
> > > > [1] https://cwiki.apache.org/confluence/x/bQ5JFg
> > > >
> > > > Best
> > > > XixiChen
> > > >
> > >
> >
>
