good!

Leonard(Lifeng Nie) <nielif...@apache.org> wrote on Wed, Dec 4, 2024 at 10:56:

> The design looks good to me.
>
> But the picture you provided doesn't seem to display properly.
>
> Jia Fan <fanjia1...@gmail.com> wrote on Wed, Dec 4, 2024 at 09:45:
>
> > Thanks shenghang!
> > The design looks good to me.
> >
> > zhangshenghang <shengh...@apache.org> wrote on Tue, Dec 3, 2024 at 20:52:
> >
> > > Hi SeaTunnel members,
> > >
> > > I would like to discuss the optimization plan for the SeaTunnel engine
> > > task scheduling strategy:
> > >
> > > Currently, our task slot allocation strategy is: Random.
> > >
> > > We plan to add two new scheduling strategies:
> > >
> > >    1. SLOT_RATIO
> > >    2. SYSTEM_LOAD
> > >
> > > Detailed Plan
> > >
> > > SLOT_RATIO
> > >
> > > This strategy schedules based on the slot usage rate of each worker.
> > > Workers with lower usage rates have higher priority.
> > >
> > > *Calculation Logic*:
> > >
> > >    1. Obtain the total number of worker slots.
> > >    2. Get the number of unallocated slots.
> > >    3. Usage rate = (Total slots - Unallocated slots) / Total slots.
> > >
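The SLOT_RATIO steps quoted above could be sketched roughly as follows. This is only an illustration in Python, not the engine's code: the `Worker` record, its field names, and `pick_worker` are hypothetical, and treating a worker with zero slots as fully used is my own assumption.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    # Hypothetical record; the field names are illustrative, not SeaTunnel's API.
    name: str
    total_slots: int
    unassigned_slots: int


def slot_usage_rate(worker: Worker) -> float:
    """Usage rate = (total slots - unallocated slots) / total slots."""
    if worker.total_slots <= 0:
        # Assumption: a worker without slots counts as fully used.
        return 1.0
    return (worker.total_slots - worker.unassigned_slots) / worker.total_slots


def pick_worker(workers: list[Worker]) -> Worker:
    """Return the worker with the lowest slot usage rate that still has a free slot."""
    candidates = [w for w in workers if w.unassigned_slots > 0]
    if not candidates:
        raise ValueError("no worker has a free slot")
    return min(candidates, key=slot_usage_rate)


workers = [
    Worker("worker-a", total_slots=10, unassigned_slots=2),
    Worker("worker-b", total_slots=10, unassigned_slots=7),
    Worker("worker-c", total_slots=4, unassigned_slots=1),
]
chosen = pick_worker(workers)
print(chosen.name, slot_usage_rate(chosen))
```

With these made-up numbers the usage rates are 0.8, 0.3 and 0.75, so `worker-b` would be preferred.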
> > > SYSTEM_LOAD
> > >
> > > *Weight Distribution and Calculation Explanation*
> > >
> > >    - *Time Weight Design*: The raw time weights are 4, 2, 2, 1, 1, which
> > >      are normalized so that they sum to 1. The weight for each time
> > >      period is calculated as:
> > >      [image: image.png]
> > >       - The weight for the most recent sample is 0.4, the weight for the
> > >         sample from three minutes ago is 0.2, and so on.
> > >    - *CPU and Memory Resource Contribution*: The CPU and memory
> > >      utilization rates are combined with their respective weights to
> > >      calculate the comprehensive system resource utilization. The
> > >      formula is:
> > >      [image: image.png]
> > >    - *Time Decay Factor*: The comprehensive resource utilization rate of
> > >      each sample is multiplied by the corresponding time weight, and the
> > >      results are summed to obtain a time-weighted average.
> > >
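Since the formula images in the quoted mail do not display, here is a small Python sketch of the two building blocks as I read them. The normalization matches the weights [0.4, 0.2, 0.2, 0.1, 0.1] given later in the proposal; the weighted sum of CPU and memory is my assumption about the missing formula, and the function names are hypothetical.

```python
def normalize(raw_weights):
    """Scale raw weights so that they sum to 1."""
    total = sum(raw_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [w / total for w in raw_weights]


def combined_utilization(cpu, mem, cpu_weight=0.5, mem_weight=0.5):
    """Assumed form: weighted sum of CPU and memory utilization (both in 0..1)."""
    return cpu * cpu_weight + mem * mem_weight


# Raw time weights from the proposal, newest sample first.
time_weights = normalize([4, 2, 2, 1, 1])
print(time_weights)

# One sample with 60% CPU and 40% memory utilization.
print(combined_utilization(0.6, 0.4))
```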
> > > *Overall Scheduling Formula*
> > >
> > > The overall scheduling priority is calculated as follows:
> > >
> > > [image: image.png]
> > > [image: image.png]
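
The two formula images above are not available in the archive. One plausible reading of the surrounding description, written out as a guess rather than as the author's exact formula, is the following. Here r_i are the raw time weights, w_i the normalized ones, C_i and M_i the CPU and memory utilization of sample i (i = 1 being the most recent), and alpha and beta the CPU and memory weights; all of these symbols are my own notation.

```latex
% Assumed reconstruction; the original formula images are not available.
w_i = \frac{r_i}{\sum_{j=1}^{5} r_j}, \qquad (r_1, \dots, r_5) = (4, 2, 2, 1, 1)

U_i = \alpha \, C_i + \beta \, M_i

L = \sum_{i=1}^{5} w_i \, U_i
```

Under this reading a worker with a smaller L is less loaded and would be scheduled first. Whether the proposal ranks by L directly or by an idle rate such as 1 - L cannot be told from the text alone.
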
> > > *Implementation Logic*
> > >
> > >    - *Data Collection*:
> > >       - Collect CPU and memory utilization every 3 minutes, storing the
> > >         last 5 statistics.
> > >       - Bind each collected sample to its corresponding time weight.
> > >    - *Priority Calculation*:
> > >       - Based on the collected CPU and memory utilization, calculate the
> > >         scheduling priority for each instance using the formula.
> > >       - Use the calculated result as the core basis for load
> > >         distribution.
> > >    - *Dynamic Adjustment*:
> > >       - Use a sliding window to update the most recent 5 statistics.
> > >       - Reduce the weight of older data to better adapt to the latest
> > >         load changes.
> > >
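The sliding window described above might look something like this in Python. It is a sketch under assumptions: the `LoadWindow` class is hypothetical, the CPU/memory combination is the same assumed weighted sum, and renormalizing the weights while fewer than 5 samples exist is my own choice, which the proposal does not specify.

```python
from collections import deque

# Normalized time weights from the proposal, newest sample first.
TIME_WEIGHTS = [0.4, 0.2, 0.2, 0.1, 0.1]


class LoadWindow:
    """Keeps the most recent samples and computes a time-weighted load."""

    def __init__(self, size=5, cpu_weight=0.5, mem_weight=0.5):
        # With maxlen set, appendleft() on a full deque drops the oldest
        # sample from the right end automatically.
        self.samples = deque(maxlen=size)
        self.cpu_weight = cpu_weight
        self.mem_weight = mem_weight

    def record(self, cpu, mem):
        """Store one sample; called once per collection interval (3 minutes)."""
        self.samples.appendleft((cpu, mem))

    def weighted_load(self):
        """Time-weighted average of the combined CPU/memory utilization."""
        if not self.samples:
            return 0.0
        weights = TIME_WEIGHTS[: len(self.samples)]
        # Assumption: renormalize when the window is not yet full.
        total = sum(weights)
        score = sum(
            w * (cpu * self.cpu_weight + mem * self.mem_weight)
            for w, (cpu, mem) in zip(weights, self.samples)
        )
        return score / total


window = LoadWindow()
for cpu, mem in [(0.9, 0.9), (0.5, 0.5), (0.5, 0.5), (0.5, 0.5), (0.5, 0.5), (0.1, 0.1)]:
    window.record(cpu, mem)
print(len(window.samples), window.weighted_load())
```

After the sixth sample the oldest one (0.9, 0.9) has fallen out of the window, and the newest sample (0.1, 0.1) carries the 0.4 weight.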
> > > *Example Data Calculation*
> > >
> > >    - Assume the CPU and memory utilization rates for 5 instances are as
> > >      follows:
> > >      [image: image.png]
> > >    - The CPU and memory weight configurations are both 0.5, and the time
> > >      weights are [0.4, 0.2, 0.2, 0.1, 0.1].
> > >    - The corresponding scheduling priority is calculated as:
> > >      [image: image.png]
> > >    - The final result is the scheduling priority value, which can be used
> > >      for load distribution.
> > >
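Because the example table and result images are missing, the numbers below are invented purely for illustration; only the 0.5/0.5 resource weights and the time weights come from the proposal. The sketch ranks two hypothetical workers by the assumed time-weighted load, with the lower load preferred.

```python
TIME_WEIGHTS = [0.4, 0.2, 0.2, 0.1, 0.1]  # newest sample first
CPU_WEIGHT = 0.5
MEM_WEIGHT = 0.5

# Hypothetical (cpu, mem) samples per worker, newest first. Not real data.
history = {
    "worker-a": [(0.9, 0.8), (0.3, 0.3), (0.3, 0.3), (0.2, 0.2), (0.2, 0.2)],
    "worker-b": [(0.3, 0.4), (0.7, 0.7), (0.7, 0.7), (0.8, 0.8), (0.8, 0.8)],
}


def weighted_load(samples):
    """Assumed score: sum of time weight x (weighted CPU + weighted memory)."""
    return sum(
        w * (CPU_WEIGHT * cpu + MEM_WEIGHT * mem)
        for w, (cpu, mem) in zip(TIME_WEIGHTS, samples)
    )


loads = {name: weighted_load(samples) for name, samples in history.items()}
preferred = min(loads, key=loads.get)
print(loads, preferred)
```

For `worker-a` this gives 0.4 x 0.85 + 2 x 0.2 x 0.3 + 2 x 0.1 x 0.2 = 0.50, and for `worker-b` 0.4 x 0.35 + 2 x 0.2 x 0.7 + 2 x 0.1 x 0.8 = 0.58, so `worker-a` would be chosen in this made-up case.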
> > > Looking forward to your suggestions.
> > >
> > > You can also discuss it in the issue:
> > > https://github.com/apache/seatunnel/issues/8205
> > >
> > >
> > >
> > > Regards,
> > > Jast (Shenghang)
> > >
> >
>
>
> --
> Warm Regards,
>
> Leonard(LiFeng Nie)
>
