> understand a lot of internal implementation details).
> > > > > > > >>>>>>> Maybe it could be a future improvement (if it's worthw
> > > > > > > >>>>>>> be scheduled in topological order, so the large table side
> > > > > > > >>>>>>> can only be scheduled after the Ru
> > > > > >>>>>>> relevant experience on it, welcome to give some suggestions.
> > > > > >>>>>>>
> > > > > >>>>>>>>> What's the representation of the runtime filter node in
> > >
>>>>>>>>
> > > > >>>>>>>>> Schedule the TableSource(dim) first.
> > > > >>>>>>>>
> > > > >>>>>>>> How does it know to schedule the TableSource(dim) first? IMO, In
> > > > >>>>>>>> Nice to see this valuable feature. After reading the FLIP I have
> > > > >>>>>>>> some questions below:
> > > > >>>>>>>>
> > > > >>>>>
> > > >> value. The same to table.optimizer.runtime-filter.max-build-data-size
> > > >>>>>>>>
> > > >>>>>>>>> the runtime filter can be pushed down along the probe side, as
> > >>>>>>>>> it would be reasonable to also support "pipeline shuffle" if possible.
> > >>>>>>>>>
> > >>>>>>>>> "p
> >>>>>>>>> ern that "Even if the RuntimeFilter becomes running before the
> >>>>>>>>> RuntimeFilterBuilder finished, it will not process any data and will
>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-25318
> >>>>>>> Lijie Wang <wangdachui9...@gmail.com>
> >>>> wrote on Thu, Jun 15, 2023 at 14:18:
> >>>>>>>
>>>>>>>> between RuntimeFilterBuilder and RuntimeFilter to be BLOCKING
>>>>>>>> (regardless of which BatchShuffleMode is set). Because the RuntimeFilter
>>>>>>>> really doesn’t need to
>
> >>>>>> Lijie Wang <wangdachui9...@gmail.com>
> >> wrote on Thu, Jun 15, 2023 at 09:48:
> >>>>>>
> >>>>>>> Hi Yuxia,
> >>>>>>>
> >>>>>>>
>>>>>>> available, we will not inject a runtime filter (as you said, we can hardly
>>>>>>> evaluate the benefits). Besides, AFAIK, the estimated data size of build
>>>>>>> side is also based on
> >>>>> phase) -> No filter
> >>>>> Estimated data size meets the requirement (in planner optimization phase),
> >>>>> but the real data size does not meet the requirement (in execution phase) ->
.@alumni.sjtu.edu.cn>>
>>>>> wrote on Wed, Jun 14, 2023 at 20:37:
>>>>>
>>>>>> Thanks Lijie for starting this discussion. Excited to see that runtime
>>>>>> filter is to be implemented in Flink.
>>>>>> I have a few ques
unt
> > > >> instead`. So, does the row count come from the statistics of the
> > > >> underlying table? What if the statistics are also unavailable,
> > > >> considering users may not always remember to generate statistics in
> > > >> production?
> > > >> I'
> benefits of runtime-filter.
> > >>
> > >>
> > >> 2: The FLIP said: "We will inject the runtime filters only if the
> > >> following requirements are met:xxx", but it also said, "Once this limit is
> > >> exceed
> >> We will inject the runtime filters only if the following requirements are
> >> met:xxx", but it also said, "Once this limit is exceeded, it will output a
> >> fake filter (which always returns true)" in `RuntimeFilterBuilderOperator`
> >> part; Seems they ar
>> 3: Does it also mean runtime-filter can only take effect in blocking
>> shuffle?
>>
>>
>>
>> Best regards,
>> Yuxia
>>
>> ----- Original Message -----
>> From: "ron9 liu"
>> To: "dev"
>> Sent: Wednesday, June 14, 2023
> 3: Does it also mean runtime-filter can only take effect in blocking
> shuffle?
>
>
>
> Best regards,
> Yuxia
>
> ----- Original Message -----
> From: "ron9 liu"
> To: "dev"
> Sent: Wednesday, June 14, 2023, 5:29:28 PM
> Subject: Re: [DISCUSS] FLIP-324: Introduce
'm wondering what the real behavior is: will no filter be injected, or a fake filter?
3: Does it also mean runtime-filter can only take effect in blocking shuffle?
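
One way to read the replies quoted earlier in the thread is that the two statements in question 2 apply at different phases: the planner decides from the estimated build-side size whether to inject a filter at all, and the builder falls back to a fake (always-true) filter only when the real size exceeds the limit at execution time. Below is a minimal Java sketch of that reading. Every name in it (`RuntimeFilterSketch`, `shouldInject`, `Builder`) is hypothetical and is not a Flink class or API. An exact `HashSet` stands in for whatever filter structure the FLIP actually uses, and key length is only a crude stand-in for data size.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Illustrative only: none of these names are Flink classes or APIs. */
final class RuntimeFilterSketch {

    /**
     * Planner phase: decide from the ESTIMATED build-side size whether a
     * runtime filter is injected at all. A negative estimate models
     * "statistics unavailable", in which case no filter is injected.
     */
    static boolean shouldInject(long estimatedBuildBytes, long maxBuildBytes) {
        return estimatedBuildBytes >= 0 && estimatedBuildBytes <= maxBuildBytes;
    }

    /** Execution phase: collects build-side keys and tracks the REAL size. */
    static final class Builder {
        private final long maxBuildBytes;
        private final Set<String> keys = new HashSet<>();
        private long seenBytes = 0;
        private boolean overLimit = false;

        Builder(long maxBuildBytes) {
            this.maxBuildBytes = maxBuildBytes;
        }

        void add(String key) {
            if (overLimit) {
                return; // already given up; ignore further build-side input
            }
            seenBytes += key.length(); // assumption: one byte per character
            if (seenBytes > maxBuildBytes) {
                overLimit = true; // real size exceeded the limit
                keys.clear();     // release what was collected so far
            } else {
                keys.add(key);
            }
        }

        /** Returns the filter that the probe side would apply. */
        Predicate<String> finish() {
            if (overLimit) {
                return key -> true; // "fake filter": lets every row through
            }
            Set<String> snapshot = new HashSet<>(keys);
            return snapshot::contains;
        }
    }
}
```

Under this reading, "no filter" and "fake filter" do not conflict: the first is a plan-time decision, the second a run-time fallback that keeps the join result correct while giving up the pruning benefit.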
Best regards,
Yuxia
----- Original Message -----
From: "ron9 liu"
To: "dev"
Sent: Wednesday, June 14, 2023, 5:29:28 PM
Subject: Re: [DIS
Thanks Lijie for starting this discussion. Runtime Filter is a common
optimization to improve join performance that has been adopted by many
computing engines such as Spark, Doris, etc. Flink is a unified stream and
batch computing engine, and we are continuously optimizing its batch performance.
Hi devs
Ron Liu, Gen Luo and I would like to start a discussion about FLIP-324:
Introduce Runtime Filter for Flink Batch Jobs[1]
Runtime Filter is a common optimization to improve join performance. It is
designed to dynamically generate filter conditions for certain Join queries
at runtime to