Re: [DISCUSS] FLIP-370 : Support Balanced Tasks Scheduling

2023-10-10 Thread Yuepeng Pan
Hi, xiangyu,
Thanks for your attention as well.

>1, About the waiting mechanism: Will the waiting mechanism happen only in
>the second level 'assigning slots to TM'? IIUC, the first level 'assigning
>Tasks to Slots' needs only the asynchronous slot result from slotpool.

As described in the latest FLIP, the introduction of the waiting mechanism at 
the second level is to ensure that, in all deployment modes such as 
application, session, etc., we do not fall into a local greedy state when 
selecting the optimal slot position. This requires obtaining a global resource 
view to get the best result.
IIUC, The allocation process from Task to Slot is the generation of a mapping 
relationship between two abstract descriptions, and at this point, there is no 
coupling of information between tasks/slots and Task Managers (TMs). 


>2, About the slot LoadingWeight: it is reasonable to use the number of
>tasks by default in the beginning, but it would be better if this could be
>easily extended in future to distinguish between CPU-intensive and
>IO-intensive workloads. In some cases, TMs may have IO bottlenecks but
>others have CPU bottlenecks.

Nice feedback. In fact, as mentioned in the Google Doc, the LoadingWeight 
interface currently only includes a description of the number of tasks. So, 
IIUC, If there is a need to further expand descriptions of other resource 
loads, we just extend it based on the current interface and its 
implementations, right?
Please correct me if I have misunderstood. Thanks a lot~

Best,
Yuepeng.

On 2023/10/06 10:19:21 xiangyu feng wrote:
> Thanks Yuepeng and Rui for driving this Discussion.
> 
> Internally when we try to use Flink 1.17.1 in production, we are also
> suffering from the unbalanced task distribution problem for jobs with high
> qps and complex dag. So +1 for the overall proposal.
> 
> Some questions about the details:
> 
> 1, About the waiting mechanism: Will the waiting mechanism happen only in
> the second level 'assigning slots to TM'?  IIUC, the first level 'assigning
> Tasks to Slots' needs only the asynchronous slot result from slotpool.
> 
> 2, About the slot LoadingWeight: it is reasonable to use the number of
> tasks by default in the beginning, but it would be better if this could be
> easily extended in future to distinguish between CPU-intensive and
> IO-intensive workloads. In some cases, TMs may have IO bottlenecks but
> others have CPU bottlenecks.
> 
> Regards,
> Xiangyu
> 
> 
> Yuepeng Pan  于2023年10月5日周四 18:34写道:
> 
> > Hi, Zhu Zhu,
> >
> > Thanks for your feedback!
> >
> > > I think we can introduce a new config option
> > > `taskmanager.load-balance.mode`,
> > > which accepts "None"/"Slots"/"Tasks". `cluster.evenly-spread-out-slots`
> > > can be superseded by the "Slots" mode and get deprecated. In the future
> > > it can support more mode, e.g. "CpuCores", to work better for jobs with
> > > fine-grained resources. The proposed config option
> > > `slot.request.max-interval`
> > > then can be renamed to
> > `taskmanager.load-balance.request-stablizing-timeout`
> > > to show its relation with the feature. The proposed
> > `slot.sharing-strategy`
> > > is not needed, because the configured "Tasks" mode will do the work.
> >
> > The new proposed configuration option sounds good to me.
> >
> > I have a small question, If we set our configuration value to 'Tasks,' it
> > will initiate two processes: balancing the allocation of task quantities at
> > the slot level and balancing the number of tasks across TaskManagers (TMs).
> > Alternatively, if we configure it as 'Slots,' the system will employ the
> > LocalPreferred allocation policy (which is the default) when assigning
> > tasks to slots, and it will ensure that the number of slots used across TMs
> > is balanced.
> > Does  this configuration essentially combine a balanced selection strategy
> > across two dimensions into fixed configuration items, right?
> >
> > I would appreciate it if you could correct me if I've made any errors.
> >
> > Best,
> > Yuepeng.
> >
> 


[jira] [Created] (FLINK-33236) Remove the unused high-availability.zookeeper.path.running-registry option

2023-10-10 Thread Zhanghao Chen (Jira)
Zhanghao Chen created FLINK-33236:
-

 Summary: Remove the unused 
high-availability.zookeeper.path.running-registry option
 Key: FLINK-33236
 URL: https://issues.apache.org/jira/browse/FLINK-33236
 Project: Flink
  Issue Type: Technical Debt
  Components: Runtime / Configuration
Affects Versions: 1.18.0
Reporter: Zhanghao Chen
 Fix For: 1.19.0


The running registry subcomponent of Flink HA has been removed in 
[FLINK-25430|https://issues.apache.org/jira/browse/FLINK-25430] and the 
"high-availability.zookeeper.path.running-registry" option is of no use after 
that. We should remove the option and regenerate the config doc to remove the 
relevant descriptions to avoid user's confusion. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Re: [DISCUSS] FLIP-373: Support Configuring Different State TTLs using SQL Hint

2023-10-10 Thread Feng Jin
Hi Jane,

Thank you for providing this explanation.

Another small question, since there is no exception thrown when the STATE
hint is set incorrectly,
should we somehow show that the TTL setting has taken effect?
For instance, exhibiting the TTL option within the operator's description?

Best,
Feng

On Tue, Oct 10, 2023 at 7:51 PM Xuyang  wrote:

> Hi, Jane.
>
>
> I think this syntax will be easier for users to set operator ttl. So big
> +1. I left some minor comments here.
>
>
> I notice that using STATE_TTL hints wrongly will not throw any exceptions.
> But it seems that in the current join hint scenario,
> if user uses an unknown table name as the chosen side, a validation
> exception will be thrown.
> Maybe we should distinguish which exceptions need to be thrown explicitly.
>
>
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> At 2023-10-10 18:23:55, "Jane Chan"  wrote:
> >Hi Feng,
> >
> >Thank you for your valuable comments. The reason for not including the
> >scenarios above is as follows:
> >
> >For <1>, the automatically inferred stateful operators are not easily
> >expressible in SQL. This issue was discussed in FLIP-292, and besides
> >ChangelogNormalize, SinkUpsertMateralizer also faces the same problem.
> >
> >For <2> and <3>, the challenge lies in internal implementation. During the
> >default_rewrite phase, the row_number expression in LogicalProject is
> >transformed into LogicalWindow by Calcite's
> >CoreRules#PROJECT_TO_LOGICAL_PROJECT_AND_WINDOW. However, CalcRelSplitter
> >does not pass the hints as an input argument when creating LogicalWindow,
> >resulting in the loss of the hint at this step. To support this, we may
> >need to rewrite some optimization rules in Calcite, which could be a
> >follow-up work if required.
> >
> >Best,
> >Jane
> >
> >On Tue, Oct 10, 2023 at 1:40 AM Feng Jin  wrote:
> >
> >> Hi Jane,
> >>
> >> Thank you for proposing this FLIP.
> >>
> >> I believe that this FLIP will greatly enhance the flexibility of setting
> >> state, and by setting different operators' TTL, it can also increase job
> >> stability, especially in regular join scenarios.
> >> The parameter design is very concise, big +1 for this, and it is also
> >> relatively easy to use for users.
> >>
> >>
> >> I have a small question: in the FLIP, it only mentions join and group.
> >> Should we also consider other scenarios?
> >>
> >> 1. the auto generated deduplicate operator[1].
> >> 2. the deduplicate query[2].
> >> 3. the topN query[3].
> >>
> >> [1]
> >>
> >>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-exec-source-cdc-events-duplicate
> >> [2]
> >>
> >>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/deduplication/
> >> [3]
> >>
> >>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/topn/
> >>
> >>
> >> Best,
> >> Feng
> >>
> >> On Sun, Oct 8, 2023 at 5:53 PM Jane Chan  wrote:
> >>
> >> > Hi devs,
> >> >
> >> > I'd like to initiate a discussion on FLIP-373: Support Configuring
> >> > Different State TTLs using SQL Hint [1]. This proposal is on top of
> the
> >> > FLIP-292 [2] to address typical scenarios with unambiguous semantics
> and
> >> > hint propagation.
> >> >
> >> > I'm looking forward to your opinions!
> >> >
> >> >
> >> > [1]
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-373%3A+Support+Configuring+Different+State+TTLs+using+SQL+Hint
> >> > [2]
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-292%3A+Enhance+COMPILED+PLAN+to+support+operator-level+state+TTL+configuration
> >> >
> >> > Best,
> >> > Jane
> >> >
> >>
>


Re: [DISCUSS] FLIP-370 : Support Balanced Tasks Scheduling

2023-10-10 Thread Yuepeng Pan
Hi, David, 
Thank you very much for your attention.

>The problem you're trying to solve only exists in complex graphs with
>different per-vertex parallelism. If the parallelism is set globally
>(assuming the pipeline has roughly even data skew), the algorithm could
>make things slightly worse by eliminating some local exchanges. Is that
>correct? 

Your understanding is accurate, and it's undeniable that such use case 
scenarios exist.

>Where I'm headed with this is that there could be a hybrid strategy that
>provides a reasonable default when the pipeline uses slot-sharing (for
>per-vertex parallelism, use the new strategy; for global parallelism use
>the old one). It's always a shame if improvements like this end up being a
>power-user feature and very few workloads benefit from it. Any thoughts? 

The concept of letting the engine determine the scheduling strategy based on a 
predefined rule is excellent. This approach aims to maximize job performance 
while minimizing user intervention.
It might not need to rush into implementing this rule at this moment. What I 
mean is, we can evaluate and develop a well-founded rule in future work. 
Nonetheless, we can still consider this rule in advance so that it can be 
validated after the feature's release.
Additionally, if we decide to implement this rule in the future, it should be 
introduced as a switch. As you pointed out, we currently don't take data 
characteristics' impact on task resource allocation in the actual environment 
into account. Therefore, implementing it as a switch will offer users greater 
flexibility. Of course, it will add a little complexity to users' understanding 
of this parameter.

I'm also eager to hear from other contributors regarding it and looking forward 
to your reply.

Best,
Yuepeng.

On 2023/10/02 20:37:12 David Morávek wrote:
> Hello Yuepeng,
> 
> The FLIP reads sane; nice work! To re-phrase my understanding:
> 
> The problem you're trying to solve only exists in complex graphs with
> different per-vertex parallelism. If the parallelism is set globally
> (assuming the pipeline has roughly even data skew), the algorithm could
> make things slightly worse by eliminating some local exchanges. Is that
> correct?
> 
> Where I'm headed with this is that there could be a hybrid strategy that
> provides a reasonable default when the pipeline uses slot-sharing (for
> per-vertex parallelism, use the new strategy; for global parallelism use
> the old one). It's always a shame if improvements like this end up being a
> power-user feature and very few workloads benefit from it. Any thoughts?
> 
> Best,
> D.
> 
> On Sun, Oct 1, 2023 at 1:38 PM Yangze Guo  wrote:
> 
> > Hi, Rui,
> >
> > 1. With the current mechanism, when physical slots are offered from
> > TM, the JobMaster will start deploying tasks and synchronizing their
> > states. With the addition of the waiting mechanism, IIUC, the
> > JobMaster will deploy and synchronize the states of all tasks only
> > after all resources are available. The task deployment and state
> > synchronization both occupy the JobMaster's RPC main thread. In
> > complex jobs with a lot of tasks, this waiting mechanism may increase
> > the pressure on the JobMaster and increase the end-to-end job
> > deployment time.
> >
> > 2. From my understanding, if user enable the
> > cluster.evenly-spread-out-slots,
> > LeastUtilizationResourceMatchingStrategy will be used to determine the
> > slot distribution and the slot allocation in the three TM will be
> > (taskmanager.numberOfTaskSlots=3):
> > TM1: 3 slot
> > TM2: 2 slot
> > TM3: 2 slot
> >
> > Best,
> > Yangze Guo
> >
> > On Sun, Oct 1, 2023 at 6:14 PM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > > Hi Shammon,
> > >
> > > Thanks for your feedback as well!
> > >
> > > > IIUC, the overall balance is divided into two parts: slot to TM and
> > task
> > > to slot.
> > > > 1. Slot to TM is guaranteed by SlotManager in ResourceManager
> > > > 2. Task to slot is guaranteed by the slot pool in JM
> > > >
> > > > These two are completely independent, what are the benefits of unifying
> > > > these two into one option? Also, do we want to share the same
> > > > option between SlotPool in JM and SlotManager in RM? This sounds a bit
> > > > strange.
> > >
> > > Your understanding is totally right, the balance needs 2 parts: slot to
> > TM
> > > and task to slot.
> > >
> > > As I understand, the following are benefits of unifying them into one
> > > option:
> > >
> > > - Flink users don't care about these principles inside of flink, they
> > don't
> > > know these 2 parts.
> > > - If flink provides 2 options, flink users need to set 2 options for
> > their
> > > job.
> > > - If one option is missed, the final result may not be good. (Users may
> > > have questions when using)
> > > - If flink just provides 1 option, enabling one option is enough. (Reduce
> > > the probability of misconfiguration)
> > >
> > > Also, Flink’s options are user-oriented. Each option 

[jira] [Created] (FLINK-33235) Quickstart guide for Flink OLAP should support building from master branch

2023-10-10 Thread xiangyu feng (Jira)
xiangyu feng created FLINK-33235:


 Summary: Quickstart guide for Flink OLAP should support building 
from master branch
 Key: FLINK-33235
 URL: https://issues.apache.org/jira/browse/FLINK-33235
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: xiangyu feng


Many features required by OLAP session cluster are still in master branch or 
in-progress and not released yet, for example: FLIP-295, FLIP-362, FLIP-374. We 
need to address this in the document and show users how to quickly build OLAP 
Session Cluster from master branch.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-374: Adding a separate configuration for specifying Java Options of the SQL Gateway

2023-10-10 Thread Zakelly Lan
+1(non-binding)

Thanks for driving this.

Best,
Zakelly

On Wed, Oct 11, 2023 at 11:22 AM Weihua Hu  wrote:
>
> +1(binding)
>
> Best,
> Weihua
>
>
> On Wed, Oct 11, 2023 at 10:56 AM xiangyu feng  wrote:
>
> > +1(non-binding)
> >
> > Regards,
> > Xiangyu
> >
> > Shammon FY  于2023年10月11日周三 10:30写道:
> >
> > > +1(binding), good job!
> > >
> > > Best,
> > > Shammon FY
> > >
> > > On Wed, Oct 11, 2023 at 10:18 AM Benchao Li 
> > wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > Rui Fan <1996fan...@gmail.com> 于2023年10月11日周三 10:17写道:
> > > > >
> > > > > +1(binding)
> > > > >
> > > > > Best,
> > > > > Rui
> > > > >
> > > > > On Wed, Oct 11, 2023 at 10:07 AM Yangze Guo 
> > > wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I'd like to start the vote of FLIP-374 [1]. This FLIP is discussed
> > in
> > > > > > the thread [2].
> > > > > >
> > > > > > The vote will be open for at least 72 hours. Unless there is an
> > > > > > objection, I'll try to close it by October 16, 2023 if we have
> > > > > > received sufficient votes.
> > > > > >
> > > > > > [1]
> > > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-374%3A+Adding+a+separate+configuration+for+specifying+Java+Options+of+the+SQL+Gateway
> > > > > > [2]
> > https://lists.apache.org/thread/g4vl8mgnwgl7vjyvjy6zrc8w54b2lthv
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Best,
> > > > Benchao Li
> > > >
> > >
> >


Re: FW: RE: [DISCUSS] FLIP-368 Reorganize the exceptions thrown in state interfaces

2023-10-10 Thread Zakelly Lan
Hi David,

Thanks for your response.

The exceptions thrown by state interfaces are NOT retriable. For
example, there may be some elements sent to the wrong subtask due to a
non-deterministic hashCode() algorithm and the key group is not
matching. Or the rocksdb may fail to read a file if it has been
deleted by the user. If there are future implementations that are
worth retrying (such as network access), it would be better to let the
implementation itself handle the retries and provide a configuration
for this, rather than requiring users to catch these exceptions.

Regarding the release and documentation, I have mentioned that this
change is targeted for version 1.19 with proper documentation. You may
have noticed that state interfaces are annotated with @PublicEvolving,
which means these interfaces may change across versions. The changes
are suitable for a minor release (1.18.0 currently to 1.19.0 in the
future) as defined by the API compatibility guarantees of Flink[1].



Best,
Zakelly


[1] 
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/upgrading/#api-compatibility-guarantees

On Tue, Oct 10, 2023 at 6:19 PM David Radley  wrote:
>
> Hi,
> I notice 
> https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/api/common/state/ValueState.html
>  is an external API. I am concerned that this change will break existing 
> applications using the old interface, they are likely to have catches / 
> throws around the existing checked Exceptions.
>
> If we go with RunTimeException, I would suggest that this sort of breaking 
> change should be done on a Flink version change, where it is appropriate to 
> make breaking changes to the API with associated documentation.
>
> If we want this change on a minor release,  we could create a new class 
> ValueState2– that is used internally with the cleaned up Exceptions, but 
> still expose the old class and Exceptions for existing external applications. 
> I guess new applications could use the new ValueState2 .
>
> What do you think?
> Kind regards, David.
>
>
> From: David Radley 
> Date: Tuesday, 10 October 2023 at 09:49
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] RE: [DISCUSS] FLIP-368 Reorganize the exceptions thrown 
> in state interfaces
> Hi ,
> The argument seems to be that the errors cannot be acted on so should be 
> runtime exceptions. I want to confirm that none of these errors could / 
> should be retriable. If there is a possibility that the state is available at 
> some time later then I assume a checked retriable Exception would be 
> appropriate for those cases; and be part of the contract with the caller. Can 
> we be sure that there is no possibility that the state will become available; 
> if so then I agree that a runtime Exception is appropriate. What do you think?
>
>
>
> Kind regards, David.
>
>
> From: Zakelly Lan 
> Date: Monday, 9 October 2023 at 18:12
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] Re: [DISCUSS] FLIP-368 Reorganize the exceptions thrown 
> in state interfaces
> Hi everyone,
>
> It seems we're gradually reaching a consensus. So I would like to
> start a vote after 72 hours if there are no further discussions.
>
> Please let me know if you have any concerns, thanks!
>
>
> Best,
> Zakelly
>
>
> On Sat, Oct 7, 2023 at 4:07 PM Zakelly Lan  wrote:
> >
> > Hi Jing,
> >
> > Sorry for the late reply! I agree with you that we do not expect users
> > to do anything with Flink and we won't "bother" them with those
> > exceptions. However, users can still catch the `Throwable` and perform
> > any necessary logging activities, similar to how they use Java
> > Collection interfaces.
> >
> >
> > Thanks for your insights!
> >
> > Best,
> > Zakelly
> >
> > On Thu, Sep 21, 2023 at 8:43 PM Jing Ge  wrote:
> > >
> > > Fair enough! Thanks Zakelly for the information. Afaic, even users can do
> > > nothing with Flink, they still can do something in their territory, at
> > > least doing some logging and metrics stuff, or triggering some other
> > > services in their ecosystem. After all, the Flink jobs they build are part
> > > of their service component. It didn't change the fact that we are going to
> > > use the anti-pattern. Just because we didn't expect users to do
> > > anything with Flink, does not mean users don't expect to do something with
> > > the expected exception. Anyway, I am open to hearing different opinions.
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Thu, Sep 21, 2023 at 7:02 AM Zakelly Lan  wrote:
> > >
> > > > Hi Martijn,
> > > >
> > > > Thanks for the reminder!
> > > >
> > > > This FLIP proposes a change to the state API that is annotated as
> > > > @PublicEvolving and targets version 1.19.  I have clarified this in
> > > > the "Proposed Change" section of the FLIP.
> > > >
> > > >
> > > > Hi Jing,
> > > >
> > > > Thanks for sharing your thoughts! Here are my opinions:
> > > >
> > > > 1. The exceptions of the state API are usually treated as critical
> > > > ones. In 

Re: [VOTE] FLIP-374: Adding a separate configuration for specifying Java Options of the SQL Gateway

2023-10-10 Thread Weihua Hu
+1(binding)

Best,
Weihua


On Wed, Oct 11, 2023 at 10:56 AM xiangyu feng  wrote:

> +1(non-binding)
>
> Regards,
> Xiangyu
>
> Shammon FY  于2023年10月11日周三 10:30写道:
>
> > +1(binding), good job!
> >
> > Best,
> > Shammon FY
> >
> > On Wed, Oct 11, 2023 at 10:18 AM Benchao Li 
> wrote:
> >
> > > +1 (binding)
> > >
> > > Rui Fan <1996fan...@gmail.com> 于2023年10月11日周三 10:17写道:
> > > >
> > > > +1(binding)
> > > >
> > > > Best,
> > > > Rui
> > > >
> > > > On Wed, Oct 11, 2023 at 10:07 AM Yangze Guo 
> > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > I'd like to start the vote of FLIP-374 [1]. This FLIP is discussed
> in
> > > > > the thread [2].
> > > > >
> > > > > The vote will be open for at least 72 hours. Unless there is an
> > > > > objection, I'll try to close it by October 16, 2023 if we have
> > > > > received sufficient votes.
> > > > >
> > > > > [1]
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-374%3A+Adding+a+separate+configuration+for+specifying+Java+Options+of+the+SQL+Gateway
> > > > > [2]
> https://lists.apache.org/thread/g4vl8mgnwgl7vjyvjy6zrc8w54b2lthv
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > >
> > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> > >
> >
>


Re: [VOTE] FLIP-374: Adding a separate configuration for specifying Java Options of the SQL Gateway

2023-10-10 Thread xiangyu feng
+1(non-binding)

Regards,
Xiangyu

Shammon FY  于2023年10月11日周三 10:30写道:

> +1(binding), good job!
>
> Best,
> Shammon FY
>
> On Wed, Oct 11, 2023 at 10:18 AM Benchao Li  wrote:
>
> > +1 (binding)
> >
> > Rui Fan <1996fan...@gmail.com> 于2023年10月11日周三 10:17写道:
> > >
> > > +1(binding)
> > >
> > > Best,
> > > Rui
> > >
> > > On Wed, Oct 11, 2023 at 10:07 AM Yangze Guo 
> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I'd like to start the vote of FLIP-374 [1]. This FLIP is discussed in
> > > > the thread [2].
> > > >
> > > > The vote will be open for at least 72 hours. Unless there is an
> > > > objection, I'll try to close it by October 16, 2023 if we have
> > > > received sufficient votes.
> > > >
> > > > [1]
> > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-374%3A+Adding+a+separate+configuration+for+specifying+Java+Options+of+the+SQL+Gateway
> > > > [2] https://lists.apache.org/thread/g4vl8mgnwgl7vjyvjy6zrc8w54b2lthv
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >
>


Re: [VOTE] FLIP-374: Adding a separate configuration for specifying Java Options of the SQL Gateway

2023-10-10 Thread Shammon FY
+1(binding), good job!

Best,
Shammon FY

On Wed, Oct 11, 2023 at 10:18 AM Benchao Li  wrote:

> +1 (binding)
>
> Rui Fan <1996fan...@gmail.com> 于2023年10月11日周三 10:17写道:
> >
> > +1(binding)
> >
> > Best,
> > Rui
> >
> > On Wed, Oct 11, 2023 at 10:07 AM Yangze Guo  wrote:
> >
> > > Hi everyone,
> > >
> > > I'd like to start the vote of FLIP-374 [1]. This FLIP is discussed in
> > > the thread [2].
> > >
> > > The vote will be open for at least 72 hours. Unless there is an
> > > objection, I'll try to close it by October 16, 2023 if we have
> > > received sufficient votes.
> > >
> > > [1]
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-374%3A+Adding+a+separate+configuration+for+specifying+Java+Options+of+the+SQL+Gateway
> > > [2] https://lists.apache.org/thread/g4vl8mgnwgl7vjyvjy6zrc8w54b2lthv
> > >
> > > Best,
> > > Yangze Guo
> > >
>
>
>
> --
>
> Best,
> Benchao Li
>


Re: [VOTE] FLIP-374: Adding a separate configuration for specifying Java Options of the SQL Gateway

2023-10-10 Thread Benchao Li
+1 (binding)

Rui Fan <1996fan...@gmail.com> 于2023年10月11日周三 10:17写道:
>
> +1(binding)
>
> Best,
> Rui
>
> On Wed, Oct 11, 2023 at 10:07 AM Yangze Guo  wrote:
>
> > Hi everyone,
> >
> > I'd like to start the vote of FLIP-374 [1]. This FLIP is discussed in
> > the thread [2].
> >
> > The vote will be open for at least 72 hours. Unless there is an
> > objection, I'll try to close it by October 16, 2023 if we have
> > received sufficient votes.
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-374%3A+Adding+a+separate+configuration+for+specifying+Java+Options+of+the+SQL+Gateway
> > [2] https://lists.apache.org/thread/g4vl8mgnwgl7vjyvjy6zrc8w54b2lthv
> >
> > Best,
> > Yangze Guo
> >



-- 

Best,
Benchao Li


Re: [VOTE] FLIP-374: Adding a separate configuration for specifying Java Options of the SQL Gateway

2023-10-10 Thread Rui Fan
+1(binding)

Best,
Rui

On Wed, Oct 11, 2023 at 10:07 AM Yangze Guo  wrote:

> Hi everyone,
>
> I'd like to start the vote of FLIP-374 [1]. This FLIP is discussed in
> the thread [2].
>
> The vote will be open for at least 72 hours. Unless there is an
> objection, I'll try to close it by October 16, 2023 if we have
> received sufficient votes.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-374%3A+Adding+a+separate+configuration+for+specifying+Java+Options+of+the+SQL+Gateway
> [2] https://lists.apache.org/thread/g4vl8mgnwgl7vjyvjy6zrc8w54b2lthv
>
> Best,
> Yangze Guo
>


[VOTE] FLIP-374: Adding a separate configuration for specifying Java Options of the SQL Gateway

2023-10-10 Thread Yangze Guo
Hi everyone,

I'd like to start the vote of FLIP-374 [1]. This FLIP is discussed in
the thread [2].

The vote will be open for at least 72 hours. Unless there is an
objection, I'll try to close it by October 16, 2023 if we have
received sufficient votes.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-374%3A+Adding+a+separate+configuration+for+specifying+Java+Options+of+the+SQL+Gateway
[2] https://lists.apache.org/thread/g4vl8mgnwgl7vjyvjy6zrc8w54b2lthv

Best,
Yangze Guo


[ANNOUNCE] Release 1.18.0, release candidate #2

2023-10-10 Thread Jing Ge
Hi everyone,

The RC2 for Apache Flink 1.18.0 has been created. The related voting
process will be triggered once the announcement is ready. The RC2 has all
the artifacts that we would typically have for a release, except for the
release note and the website pull request for the release announcement.

The following contents are available for your review:

- Confirmation of no benchmarks regression at the thread[1].
- The preview source release and binary convenience releases [2], which
are signed with the key with fingerprint 96AE0E32CBE6E0753CE6 [3].
- all artifacts that would normally be deployed to the Maven
Central Repository [4].
- source code tag "release-1.18.0-rc2" [5]

Your help testing the release will be greatly appreciated! And we'll
create the voting thread as soon as all the efforts are finished.

[1] https://lists.apache.org/thread/yxyphglwwvq57wcqlfrnk3qo9t3sr2ro
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.18.0-rc2/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1658
[5] https://github.com/apache/flink/releases/tag/release-1.18.0-rc2

Best regards,
Konstantin, Qingsheng, Sergey, and Jing


[jira] [Created] (FLINK-33234) Bump used Guava version in Kafka E2E tests

2023-10-10 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-33234:
--

 Summary: Bump used Guava version in Kafka E2E tests
 Key: FLINK-33234
 URL: https://issues.apache.org/jira/browse/FLINK-33234
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Kafka
Reporter: Martijn Visser
Assignee: Martijn Visser


To resolve existing Dependabot PRs: 
https://github.com/apache/flink-connector-kafka/security/dependabot?q=package%3Acom.google.guava%3Aguava+manifest%3Aflink-connector-kafka-e2e-tests%2Fflink-end-to-end-tests-common-kafka%2Fpom.xml+has%3Apatch



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33233) Null point exception when non-native udf used in join condition

2023-10-10 Thread yunfan (Jira)
yunfan created FLINK-33233:
--

 Summary: Null point exception when non-native udf used in join 
condition
 Key: FLINK-33233
 URL: https://issues.apache.org/jira/browse/FLINK-33233
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Affects Versions: 1.17.0
 Environment: It can reproduced by follow code by adding this test to 
{code:java}
org.apache.flink.connectors.hive.HiveDialectQueryITCase{code}
 
{code:java}
// Add follow code to org.apache.flink.connectors.hive.HiveDialectQueryITCase
@Test
public void testUdfInJoinCondition() throws Exception {
List result = CollectionUtil.iteratorToList(tableEnv.executeSql(
"select foo.y, bar.I from bar join foo on hiveudf(foo.x) = bar.I 
where bar.I > 1").collect());
assertThat(result.toString())
.isEqualTo("[+I[2, 2]]");
} {code}
Reporter: yunfan


Any non-native udf used in hive-parser join condition. 

It will caused NullPointException.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33232) Kubernetive Operator Not Able to Take Other Python paramters While Submitting Job Deployment

2023-10-10 Thread Amarjeet Singh (Jira)
Amarjeet Singh created FLINK-33232:
--

 Summary: Kubernetive Operator Not Able to Take Other Python 
paramters While Submitting Job Deployment
 Key: FLINK-33232
 URL: https://issues.apache.org/jira/browse/FLINK-33232
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.6.0, kubernetes-operator-1.7.0
Reporter: Amarjeet Singh
 Fix For: 1.17.1






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33231) Memory leak in KafkaSourceReader if no data in consumed topic

2023-10-10 Thread Jira
Lauri Suurväli created FLINK-33231:
--

 Summary: Memory leak in KafkaSourceReader if no data in consumed 
topic
 Key: FLINK-33231
 URL: https://issues.apache.org/jira/browse/FLINK-33231
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.17.1
Reporter: Lauri Suurväli
 Attachments: Screenshot 2023-10-09 at 13.02.34.png, Screenshot 
2023-10-09 at 14.13.43.png, Screenshot 2023-10-10 at 12.49.37.png

*Problem description*

Our Flink streaming job TaskManager heap gets full when the job has nothing to 
consume and process.

It's a simple streaming job with KafkaSource -> ProcessFunction -> KafkaSink. 
When there are no messages in the source topic the TaskManager heap usage 
starts increasing until the job exits after receiving a SIGTERM signal. We are 
running the job on AWS EMR with YARN.

The problems with the TaskManager heap usage do not occur when there is data to 
process. It's also worth noting that sending a single message to the source 
topic of a streaming job that has been sitting idle and suffers from the memory 
leak will cause the heap to be cleared. However it does not resolve the problem 
since the heap usage will start increasing immediately after processing the 
message.

!Screenshot 2023-10-10 at 12.49.37.png!

TaskManager heap used percentage is calculated by 

 
{code:java}
flink.taskmanager.Status.JVM.Memory.Heap.Used * 100 / 
flink.taskmanager.Status.JVM.Memory.Heap.Max{code}
 

 

 I was able to take heap dumps of the TaskManager processes during a high heap 
usage percentage. Heap dump analysis detected 912,355 instances of 
java.util.HashMap empty collections retaining >= 43,793,040 bytes.

!Screenshot 2023-10-09 at 14.13.43.png!

The retained heap seemed to be located at:

 
{code:java}
org.apache.flink.connector.kafka.source.reader.KafkaSourceReader#offsetsToCommit{code}
 

!Screenshot 2023-10-09 at 13.02.34.png!

 

*Possible hints:*

An empty HashMap is added during the snapshotState method to offsetsToCommit 
map if it does not already exist for the given checkpoint. [KafkaSourceReader 
line 
107|https://github.com/apache/flink-connector-kafka/blob/b09928d5ef290f2a046dc1fe40b4c5cebe76f997/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/KafkaSourceReader.java#L107]

 
{code:java}
Map offsetsMap =
offsetsToCommit.computeIfAbsent(checkpointId, id -> new HashMap<>()); 
{code}
 

If the startingOffset for the given split is >= 0 then a new entry would be 
added to the map from the previous step. [KafkaSourceReader line 
113|https://github.com/apache/flink-connector-kafka/blob/b09928d5ef290f2a046dc1fe40b4c5cebe76f997/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/KafkaSourceReader.java#L113]
{code:java}
if (split.getStartingOffset() >= 0) {
offsetsMap.put(
split.getTopicPartition(),
new OffsetAndMetadata(split.getStartingOffset()));
}{code}
If the starting offset is smaller than 0 then this would leave the offsetMap 
created in step 1 empty. We can see from the logs that the startingOffset is -3 
when the splits are added to the reader.

 
{code:java}
Adding split(s) to reader: [[Partition: source-events-20, StartingOffset: 1, 
StoppingOffset: -9223372036854775808], [Partition: source-events-44, 
StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: 
source-events-12, StartingOffset: -3, StoppingOffset: -9223372036854775808], 
[Partition: source-events-36, StartingOffset: 1, StoppingOffset: 
-9223372036854775808], [Partition: source-events-4, StartingOffset: -3, 
StoppingOffset: -9223372036854775808], [Partition: source-events-28, 
StartingOffset: -3, StoppingOffset: -9223372036854775808]]{code}
 

 

The offsetsToCommit map is cleaned from entries once they have been committed 
to Kafka which happens during the callback function that is passed to the 
KafkaSourceFetcherManager.commitOffsets method in 
KafkaSourceReader.notifyCheckpointComplete method.

However if the committedPartitions is empty for the given checkpoint, then the 
KafkaSourceFetcherManager.commitOffsets method returns.  
[KafkaSourceFetcherManager line 
78|https://github.com/apache/flink-connector-kafka/blob/b09928d5ef290f2a046dc1fe40b4c5cebe76f997/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/fetcher/KafkaSourceFetcherManager.java#L78]
{code:java}
if (offsetsToCommit.isEmpty()) {
return;
} {code}
We can observe from the logs that indeed an empty map is encountered at this 
step:
{code:java}
Committing offsets {}{code}
*Conclusion*

It seems that an empty map gets added per each checkpoint to offsetsToCommit 
map. Since the startingOffset in our case is -3 then the empty map never gets 
filled. During the offset commit phase the offsets for these checkpoints are 
ignored, since there is nothing to 

Re: Operator 1.6 to Olm

2023-10-10 Thread Gyula Fóra
That would be great David, thank you!

Gyula

On Tue, 10 Oct 2023 at 14:13, David Radley  wrote:

> Hi,
> I notice that the latest version in olm of the operator is 1.5. I plan to
> run the scripts to publish the 1.6 Flink operator to olm,
>  Kind regards, David.
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>


Operator 1.6 to Olm

2023-10-10 Thread David Radley
Hi,
I notice that the latest version in olm of the operator is 1.5. I plan to run 
the scripts to publish the 1.6 Flink operator to olm,
 Kind regards, David.

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


Re:Re: [DISCUSS] FLIP-373: Support Configuring Different State TTLs using SQL Hint

2023-10-10 Thread Xuyang
Hi, Jane.


I think this syntax will be easier for users to set operator ttl. So big +1. I 
left some minor comments here.


I notice that using STATE_TTL hints wrongly will not throw any exceptions. But 
it seems that in the current join hint scenario, 
if user uses an unknown table name as the chosen side, a validation exception 
will be thrown. 
Maybe we should distinguish which exceptions need to be thrown explicitly.




--

Best!
Xuyang





At 2023-10-10 18:23:55, "Jane Chan"  wrote:
>Hi Feng,
>
>Thank you for your valuable comments. The reason for not including the
>scenarios above is as follows:
>
>For <1>, the automatically inferred stateful operators are not easily
>expressible in SQL. This issue was discussed in FLIP-292, and besides
>ChangelogNormalize, SinkUpsertMateralizer also faces the same problem.
>
>For <2> and <3>, the challenge lies in internal implementation. During the
>default_rewrite phase, the row_number expression in LogicalProject is
>transformed into LogicalWindow by Calcite's
>CoreRules#PROJECT_TO_LOGICAL_PROJECT_AND_WINDOW. However, CalcRelSplitter
>does not pass the hints as an input argument when creating LogicalWindow,
>resulting in the loss of the hint at this step. To support this, we may
>need to rewrite some optimization rules in Calcite, which could be a
>follow-up work if required.
>
>Best,
>Jane
>
>On Tue, Oct 10, 2023 at 1:40 AM Feng Jin  wrote:
>
>> Hi Jane,
>>
>> Thank you for proposing this FLIP.
>>
>> I believe that this FLIP will greatly enhance the flexibility of setting
>> state, and by setting different operators' TTL, it can also increase job
>> stability, especially in regular join scenarios.
>> The parameter design is very concise, big +1 for this, and it is also
>> relatively easy to use for users.
>>
>>
>> I have a small question: in the FLIP, it only mentions join and group.
>> Should we also consider other scenarios?
>>
>> 1. the auto generated deduplicate operator[1].
>> 2. the deduplicate query[2].
>> 3. the topN query[3].
>>
>> [1]
>>
>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-exec-source-cdc-events-duplicate
>> [2]
>>
>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/deduplication/
>> [3]
>>
>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/topn/
>>
>>
>> Best,
>> Feng
>>
>> On Sun, Oct 8, 2023 at 5:53 PM Jane Chan  wrote:
>>
>> > Hi devs,
>> >
>> > I'd like to initiate a discussion on FLIP-373: Support Configuring
>> > Different State TTLs using SQL Hint [1]. This proposal is on top of the
>> > FLIP-292 [2] to address typical scenarios with unambiguous semantics and
>> > hint propagation.
>> >
>> > I'm looking forward to your opinions!
>> >
>> >
>> > [1]
>> >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-373%3A+Support+Configuring+Different+State+TTLs+using+SQL+Hint
>> > [2]
>> >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-292%3A+Enhance+COMPILED+PLAN+to+support+operator-level+state+TTL+configuration
>> >
>> > Best,
>> > Jane
>> >
>>


[jira] [Created] (FLINK-33230) Support Expanding ExecutionGraph to StreamGraph in Flink

2023-10-10 Thread Yu Chen (Jira)
Yu Chen created FLINK-33230:
---

 Summary: Support Expanding ExecutionGraph to StreamGraph in Flink
 Key: FLINK-33230
 URL: https://issues.apache.org/jira/browse/FLINK-33230
 Project: Flink
  Issue Type: Improvement
Reporter: Yu Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-373: Support Configuring Different State TTLs using SQL Hint

2023-10-10 Thread Jane Chan
Hi Feng,

Thank you for your valuable comments. The reason for not including the
scenarios above is as follows:

For <1>, the automatically inferred stateful operators are not easily
expressible in SQL. This issue was discussed in FLIP-292, and besides
ChangelogNormalize, SinkUpsertMateralizer also faces the same problem.

For <2> and <3>, the challenge lies in internal implementation. During the
default_rewrite phase, the row_number expression in LogicalProject is
transformed into LogicalWindow by Calcite's
CoreRules#PROJECT_TO_LOGICAL_PROJECT_AND_WINDOW. However, CalcRelSplitter
does not pass the hints as an input argument when creating LogicalWindow,
resulting in the loss of the hint at this step. To support this, we may
need to rewrite some optimization rules in Calcite, which could be a
follow-up work if required.

Best,
Jane

On Tue, Oct 10, 2023 at 1:40 AM Feng Jin  wrote:

> Hi Jane,
>
> Thank you for proposing this FLIP.
>
> I believe that this FLIP will greatly enhance the flexibility of setting
> state, and by setting different operators' TTL, it can also increase job
> stability, especially in regular join scenarios.
> The parameter design is very concise, big +1 for this, and it is also
> relatively easy to use for users.
>
>
> I have a small question: in the FLIP, it only mentions join and group.
> Should we also consider other scenarios?
>
> 1. the auto generated deduplicate operator[1].
> 2. the deduplicate query[2].
> 3. the topN query[3].
>
> [1]
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-exec-source-cdc-events-duplicate
> [2]
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/deduplication/
> [3]
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/topn/
>
>
> Best,
> Feng
>
> On Sun, Oct 8, 2023 at 5:53 PM Jane Chan  wrote:
>
> > Hi devs,
> >
> > I'd like to initiate a discussion on FLIP-373: Support Configuring
> > Different State TTLs using SQL Hint [1]. This proposal is on top of the
> > FLIP-292 [2] to address typical scenarios with unambiguous semantics and
> > hint propagation.
> >
> > I'm looking forward to your opinions!
> >
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-373%3A+Support+Configuring+Different+State+TTLs+using+SQL+Hint
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-292%3A+Enhance+COMPILED+PLAN+to+support+operator-level+state+TTL+configuration
> >
> > Best,
> > Jane
> >
>


Re: [DISCUSS] FLIP-370 : Support Balanced Tasks Scheduling

2023-10-10 Thread Rui Fan
Hi Zhu,

Thanks for your clarification!

I misunderstood before, it's clear now.

Best,
Rui

On Tue, Oct 10, 2023 at 6:17 PM Zhu Zhu  wrote:

> Hi Rui,
>
> Not sure if I understand your question correctly. The two modes are not
> the same:
> {taskmanager.load-balance.mode: Slots} = {cluster.evenly-spread-out-slots:
> true, slot.sharing-strategy: LOCAL_INPUT_PREFERRED}
> {taskmanager.load-balance.mode: Tasks} = {cluster.evenly-spread-out-slots:
> true, slot.sharing-strategy: TASK_BALANCED_PREFERRED}
>
> Thanks,
> Zhu
>
> Rui Fan <1996fan...@gmail.com> 于2023年10月10日周二 10:27写道:
>
>> Hi Zhu,
>>
>> Thanks for your feedback!
>>
>> >> 2. When it's set to Tasks, how to assign slots to TM?
>> > It's option2 at the moment. However, I think it's just implementation
>> > details and can be changed/refined later.
>> >
>> > As you mentioned in another comment, 'taskmanager.load-balance.mode' is
>> > a user oriented configuration. The goal is to achieve load balance,
>> while
>> > the load can be defined as allocated slots or assigned tasks.
>> > The 'Tasks' mode, just the same as what is proposed in the FLIP,
>> currently
>> > use the mechanism of 'cluster.evenly-spread-out-slots' to help to
>> achieve
>> > balanced number of tasks. It's not perfect, but has acceptable
>> effectiveness
>> > and lower implementation complexity.
>> >
>> > The 'Slots' mode is needed for compatible reasons. Users that are
>> satisfied
>> > with the current ability of 'cluster.evenly-spread-out-slots' can
>> continue
>> > using it after the config 'cluster.evenly-spread-out-slots' is
>> deprecated.
>>
>> IIUC, the 'Slots' mode is needed for compatibility with
>> 'cluster.evenly-spread-out-slots'.
>> The reason I ask this question is: if the behavior and logic of 'Slots'
>> and
>> 'Tasks' are exactly the same, it feels a bit strange to define two
>> enumerations.
>> And it may cause confusion to users.
>>
>> If they are totally the same, how about combining them to SlotsAndTasks?
>> It can be compatible with 'cluster.evenly-spread-out-slots', and avoid
>> the redundant enum. Of course, if the name(SlotsAndTasks) is ugly,
>> we can discuss it. The core idea is combining them.
>>
>> WDYT?
>>
>> Best,
>> Rui
>>
>> On Mon, Oct 9, 2023 at 3:24 PM Zhu Zhu  wrote:
>>
>>> Thanks for the response, Rui and Yuepeng.
>>>
>>> >> Rui
>>> > 1. The default value is None, right?
>>> Exactly.
>>>
>>> > 2. When it's set to Tasks, how to assign slots to TM?
>>> It's option2 at the moment. However, I think it's just implementation
>>> details and can be changed/refined later.
>>>
>>> As you mentioned in another comment, 'taskmanager.load-balance.mode' is
>>> a user oriented configuration. The goal is to achieve load balance,
>>> while
>>> the load can be defined as allocated slots or assigned tasks.
>>> The 'Tasks' mode, just the same as what is proposed in the FLIP,
>>> currently
>>> use the mechanism of 'cluster.evenly-spread-out-slots' to help to achieve
>>> balanced number of tasks. It's not perfect, but has acceptable
>>> effectiveness
>>> and lower implementation complexity.
>>>
>>> The 'Slots' mode is needed for compatible reasons. Users that are
>>> satisfied
>>> with the current ability of 'cluster.evenly-spread-out-slots' can
>>> continue
>>> using it after the config 'cluster.evenly-spread-out-slots' is
>>> deprecated.
>>>
>>>
>>> >> Yuepeng
>>> I think what users want is load balance. The combination is
>>> implementation
>>> details and should be transparent to users.
>>>
>>> Meanwhile, I think locality does not entirely conflict with load
>>> balance. In fact,
>>> they should be both considered when assigning tasks. Usually, state
>>> locality
>>> should have the highest priority, and input locality can also be taken
>>> care
>>> of when trying to balance tasks to slots and TMs. We can see that the
>>> most
>>> important input locality, i.e. forward, is always covered in this FLIP
>>> when
>>> computing slot sharing groups. It can be further optimized if we find it
>>> problematic.
>>>
>>> Thanks,
>>> Zhu
>>>
>>> Yangze Guo  于2023年10月8日周日 13:53写道:
>>>
 Thanks for the updates, Rui.

 It does seem challenging to ensure evenness in slot deployment unless
 we introduce batch slot requests in SlotPool. However, one possibility
 is to add a delay of around 50ms during the SlotPool's resource
 requirement declaration to the ResourceManager, similar to the
 checkResourceRequirementsWithDelay in the SlotManager. In most cases,
 this delay would allow the SlotManager to see all resource
 requirements, then it can allocate the slot more evenly. As a side
 effect, it could also significantly reduce the number of RPC messages
 to the ResourceManager, which could become a single-point bottleneck
 in OLAP scenarios. WDYT?

 Best,
 Yangze Guo

 On Sat, Oct 7, 2023 at 5:52 PM Rui Fan <1996fan...@gmail.com> wrote:
 >
 > Hi Yangze,
 >
 > Thanks for your quick response!
 >
 > 

FW: RE: [DISCUSS] FLIP-368 Reorganize the exceptions thrown in state interfaces

2023-10-10 Thread David Radley
Hi,
I notice 
https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/api/common/state/ValueState.html
 is an external API. I am concerned that this change will break existing 
applications using the old interface, they are likely to have catches / throws 
around the existing checked Exceptions.

If we go with RunTimeException, I would suggest that this sort of breaking 
change should be done on a Flink version change, where it is appropriate to 
make breaking changes to the API with associated documentation.

If we want this change on a minor release,  we could create a new class 
ValueState2– that is used internally with the cleaned up Exceptions, but still 
expose the old class and Exceptions for existing external applications. I guess 
new applications could use the new ValueState2 .

What do you think?
Kind regards, David.


From: David Radley 
Date: Tuesday, 10 October 2023 at 09:49
To: dev@flink.apache.org 
Subject: [EXTERNAL] RE: [DISCUSS] FLIP-368 Reorganize the exceptions thrown in 
state interfaces
Hi ,
The argument seems to be that the errors cannot be acted on so should be 
runtime exceptions. I want to confirm that none of these errors could / should 
be retriable. If there is a possibility that the state is available at some 
time later then I assume a checked retriable Exception would be appropriate for 
those cases; and be part of the contract with the caller. Can we be sure that 
there is no possibility that the state will become available; if so then I 
agree that a runtime Exception is appropriate. What do you think?



Kind regards, David.


From: Zakelly Lan 
Date: Monday, 9 October 2023 at 18:12
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: [DISCUSS] FLIP-368 Reorganize the exceptions thrown in 
state interfaces
Hi everyone,

It seems we're gradually reaching a consensus. So I would like to
start a vote after 72 hours if there are no further discussions.

Please let me know if you have any concerns, thanks!


Best,
Zakelly


On Sat, Oct 7, 2023 at 4:07 PM Zakelly Lan  wrote:
>
> Hi Jing,
>
> Sorry for the late reply! I agree with you that we do not expect users
> to do anything with Flink and we won't "bother" them with those
> exceptions. However, users can still catch the `Throwable` and perform
> any necessary logging activities, similar to how they use Java
> Collection interfaces.
>
>
> Thanks for your insights!
>
> Best,
> Zakelly
>
> On Thu, Sep 21, 2023 at 8:43 PM Jing Ge  wrote:
> >
> > Fair enough! Thanks Zakelly for the information. Afaic, even users can do
> > nothing with Flink, they still can do something in their territory, at
> > least doing some logging and metrics stuff, or triggering some other
> > services in their ecosystem. After all, the Flink jobs they build are part
> > of their service component. It didn't change the fact that we are going to
> > use the anti-pattern. Just because we didn't expect users to do
> > anything with Flink, does not mean users don't expect to do something with
> > the expected exception. Anyway, I am open to hearing different opinions.
> >
> > Best regards,
> > Jing
> >
> > On Thu, Sep 21, 2023 at 7:02 AM Zakelly Lan  wrote:
> >
> > > Hi Martijn,
> > >
> > > Thanks for the reminder!
> > >
> > > This FLIP proposes a change to the state API that is annotated as
> > > @PublicEvolving and targets version 1.19.  I have clarified this in
> > > the "Proposed Change" section of the FLIP.
> > >
> > >
> > > Hi Jing,
> > >
> > > Thanks for sharing your thoughts! Here are my opinions:
> > >
> > > 1. The exceptions of the state API are usually treated as critical
> > > ones. In other words, if anything goes wrong with state accessing, the
> > > element processing cannot proceed and the job should fail. Flink users
> > > may not know what to do when they encounter these exceptions. I
> > > believe this is the main reason why we want to replace them with
> > > unchecked exceptions.
> > > 2. There have also been some further discussions[1][2] from Stephan
> > > and Shixiaogang below the one you pointed out [3], and it seems they
> > > come to an agreement to use unchecked exceptions. After reviewing the
> > > entire discussion on that PR, I think their arguments are reasonable
> > > given the use case.
> > >
> > > Looking forward to your feedback.
> > >
> > >
> > > Best,
> > > Zakelly
> > >
> > > [1] https://github.com/apache/flink/pull/3380#issuecomment-286807853
> > > [2] https://github.com/apache/flink/pull/3380#issuecomment-286932133
> > > [3] https://github.com/apache/flink/pull/3380#issuecomment-281631160
> > >
> > > On Thu, Sep 21, 2023 at 1:27 AM Jing Ge 
> > > wrote:
> > > >
> > > > sorry, typo: It is a known "anti-pattern" instead of "ant-pattern"
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Wed, Sep 20, 2023 at 7:23 PM Jing Ge  wrote:
> > > >
> > > > > Hi Zakelly,
> > > > >
> > > > > Thanks for driving this topic. From good software engineering's
> > > > > perspective, I have different 

Re: [DISCUSS] FLIP-370 : Support Balanced Tasks Scheduling

2023-10-10 Thread Zhu Zhu
Hi Rui,

Not sure if I understand your question correctly. The two modes are not the
same:
{taskmanager.load-balance.mode: Slots} = {cluster.evenly-spread-out-slots:
true, slot.sharing-strategy: LOCAL_INPUT_PREFERRED}
{taskmanager.load-balance.mode: Tasks} = {cluster.evenly-spread-out-slots:
true, slot.sharing-strategy: TASK_BALANCED_PREFERRED}

Thanks,
Zhu

Rui Fan <1996fan...@gmail.com> 于2023年10月10日周二 10:27写道:

> Hi Zhu,
>
> Thanks for your feedback!
>
> >> 2. When it's set to Tasks, how to assign slots to TM?
> > It's option2 at the moment. However, I think it's just implementation
> > details and can be changed/refined later.
> >
> > As you mentioned in another comment, 'taskmanager.load-balance.mode' is
> > a user oriented configuration. The goal is to achieve load balance,
> while
> > the load can be defined as allocated slots or assigned tasks.
> > The 'Tasks' mode, just the same as what is proposed in the FLIP,
> currently
> > use the mechanism of 'cluster.evenly-spread-out-slots' to help to
> achieve
> > balanced number of tasks. It's not perfect, but has acceptable
> effectiveness
> > and lower implementation complexity.
> >
> > The 'Slots' mode is needed for compatible reasons. Users that are
> satisfied
> > with the current ability of 'cluster.evenly-spread-out-slots' can
> continue
> > using it after the config 'cluster.evenly-spread-out-slots' is
> deprecated.
>
> IIUC, the 'Slots' mode is needed for compatibility with
> 'cluster.evenly-spread-out-slots'.
> The reason I ask this question is: if the behavior and logic of 'Slots'
> and
> 'Tasks' are exactly the same, it feels a bit strange to define two
> enumerations.
> And it may cause confusion to users.
>
> If they are totally the same, how about combining them to SlotsAndTasks?
> It can be compatible with 'cluster.evenly-spread-out-slots', and avoid
> the redundant enum. Of course, if the name(SlotsAndTasks) is ugly,
> we can discuss it. The core idea is combining them.
>
> WDYT?
>
> Best,
> Rui
>
> On Mon, Oct 9, 2023 at 3:24 PM Zhu Zhu  wrote:
>
>> Thanks for the response, Rui and Yuepeng.
>>
>> >> Rui
>> > 1. The default value is None, right?
>> Exactly.
>>
>> > 2. When it's set to Tasks, how to assign slots to TM?
>> It's option2 at the moment. However, I think it's just implementation
>> details and can be changed/refined later.
>>
>> As you mentioned in another comment, 'taskmanager.load-balance.mode' is
>> a user oriented configuration. The goal is to achieve load balance, while
>> the load can be defined as allocated slots or assigned tasks.
>> The 'Tasks' mode, just the same as what is proposed in the FLIP, currently
>> use the mechanism of 'cluster.evenly-spread-out-slots' to help to achieve
>> balanced number of tasks. It's not perfect, but has acceptable
>> effectiveness
>> and lower implementation complexity.
>>
>> The 'Slots' mode is needed for compatible reasons. Users that are
>> satisfied
>> with the current ability of 'cluster.evenly-spread-out-slots' can continue
>> using it after the config 'cluster.evenly-spread-out-slots' is deprecated.
>>
>>
>> >> Yuepeng
>> I think what users want is load balance. The combination is implementation
>> details and should be transparent to users.
>>
>> Meanwhile, I think locality does not entirely conflict with load balance.
>> In fact,
>> they should be both considered when assigning tasks. Usually, state
>> locality
>> should have the highest priority, and input locality can also be taken
>> care
>> of when trying to balance tasks to slots and TMs. We can see that the most
>> important input locality, i.e. forward, is always covered in this FLIP
>> when
>> computing slot sharing groups. It can be further optimized if we find it
>> problematic.
>>
>> Thanks,
>> Zhu
>>
>> Yangze Guo  于2023年10月8日周日 13:53写道:
>>
>>> Thanks for the updates, Rui.
>>>
>>> It does seem challenging to ensure evenness in slot deployment unless
>>> we introduce batch slot requests in SlotPool. However, one possibility
>>> is to add a delay of around 50ms during the SlotPool's resource
>>> requirement declaration to the ResourceManager, similar to the
>>> checkResourceRequirementsWithDelay in the SlotManager. In most cases,
>>> this delay would allow the SlotManager to see all resource
>>> requirements, then it can allocate the slot more evenly. As a side
>>> effect, it could also significantly reduce the number of RPC messages
>>> to the ResourceManager, which could become a single-point bottleneck
>>> in OLAP scenarios. WDYT?
>>>
>>> Best,
>>> Yangze Guo
>>>
>>> On Sat, Oct 7, 2023 at 5:52 PM Rui Fan <1996fan...@gmail.com> wrote:
>>> >
>>> > Hi Yangze,
>>> >
>>> > Thanks for your quick response!
>>> >
>>> > Sorry, I re-read the 2.2.2 part[1] about the Waiting Mechanism, I found
>>> > it isn't clear. The root cause of introducing the waiting mechanism is
>>> > that the slot requests are sent from JobMaster to SlotPool is
>>> > one by one instead of one whole batch. I have rewritten the 2.2.2 part,
>>> > please 

RE: [DISCUSS] FLIP-368 Reorganize the exceptions thrown in state interfaces

2023-10-10 Thread David Radley
Hi ,
The argument seems to be that the errors cannot be acted on so should be 
runtime exceptions. I want to confirm that none of these errors could / should 
be retriable. If there is a possibility that the state is available at some 
time later then I assume a checked retriable Exception would be appropriate for 
those cases; and be part of the contract with the caller. Can we be sure that 
there is no possibility that the state will become available; if so then I 
agree that a runtime Exception is appropriate. What do you think?



Kind regards, David.


From: Zakelly Lan 
Date: Monday, 9 October 2023 at 18:12
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: [DISCUSS] FLIP-368 Reorganize the exceptions thrown in 
state interfaces
Hi everyone,

It seems we're gradually reaching a consensus. So I would like to
start a vote after 72 hours if there are no further discussions.

Please let me know if you have any concerns, thanks!


Best,
Zakelly


On Sat, Oct 7, 2023 at 4:07 PM Zakelly Lan  wrote:
>
> Hi Jing,
>
> Sorry for the late reply! I agree with you that we do not expect users
> to do anything with Flink and we won't "bother" them with those
> exceptions. However, users can still catch the `Throwable` and perform
> any necessary logging activities, similar to how they use Java
> Collection interfaces.
>
>
> Thanks for your insights!
>
> Best,
> Zakelly
>
> On Thu, Sep 21, 2023 at 8:43 PM Jing Ge  wrote:
> >
> > Fair enough! Thanks Zakelly for the information. Afaic, even users can do
> > nothing with Flink, they still can do something in their territory, at
> > least doing some logging and metrics stuff, or triggering some other
> > services in their ecosystem. After all, the Flink jobs they build are part
> > of their service component. It didn't change the fact that we are going to
> > use the anti-pattern. Just because we didn't expect users to do
> > anything with Flink, does not mean users don't expect to do something with
> > the expected exception. Anyway, I am open to hearing different opinions.
> >
> > Best regards,
> > Jing
> >
> > On Thu, Sep 21, 2023 at 7:02 AM Zakelly Lan  wrote:
> >
> > > Hi Martijn,
> > >
> > > Thanks for the reminder!
> > >
> > > This FLIP proposes a change to the state API that is annotated as
> > > @PublicEvolving and targets version 1.19.  I have clarified this in
> > > the "Proposed Change" section of the FLIP.
> > >
> > >
> > > Hi Jing,
> > >
> > > Thanks for sharing your thoughts! Here are my opinions:
> > >
> > > 1. The exceptions of the state API are usually treated as critical
> > > ones. In other words, if anything goes wrong with state accessing, the
> > > element processing cannot proceed and the job should fail. Flink users
> > > may not know what to do when they encounter these exceptions. I
> > > believe this is the main reason why we want to replace them with
> > > unchecked exceptions.
> > > 2. There have also been some further discussions[1][2] from Stephan
> > > and Shixiaogang below the one you pointed out [3], and it seems they
> > > come to an agreement to use unchecked exceptions. After reviewing the
> > > entire discussion on that PR, I think their arguments are reasonable
> > > given the use case.
> > >
> > > Looking forward to your feedback.
> > >
> > >
> > > Best,
> > > Zakelly
> > >
> > > [1] https://github.com/apache/flink/pull/3380#issuecomment-286807853
> > > [2] https://github.com/apache/flink/pull/3380#issuecomment-286932133
> > > [3] https://github.com/apache/flink/pull/3380#issuecomment-281631160
> > >
> > > On Thu, Sep 21, 2023 at 1:27 AM Jing Ge 
> > > wrote:
> > > >
> > > > sorry, typo: It is a known "anti-pattern" instead of "ant-pattern"
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Wed, Sep 20, 2023 at 7:23 PM Jing Ge  wrote:
> > > >
> > > > > Hi Zakelly,
> > > > >
> > > > > Thanks for driving this topic. From good software engineering's
> > > > > perspective, I have different thoughts:
> > > > >
> > > > > 1. The idea to get rid of all checked Exceptions and replace them with
> > > > > unchecked Exceptions is a known ant-pattern: "Generally speaking, do
> > > not
> > > > > throw a RuntimeException or create a subclass of RuntimeException
> > > simply
> > > > > because you don't want to be bothered with specifying the exceptions
> > > your
> > > > > methods can throw." [1] Checked Exceptions mean expected exceptions
> > > that
> > > > > can help developers find a way to catch them and decide what to do. It
> > > is
> > > > > part of the public API signature that can help developers build robust
> > > > > systems. We should not mix concepts and build expected exceptions with
> > > > > unchecked Java Exception classes.
> > > > > 2. The comment Stephan left [2] clearly pointed out that we should
> > > avoid
> > > > > using generic Java Exceptions, and "find some more 'specific'
> > > exceptions
> > > > > for the signature, like throws IOException or throws
> > > StateAccessException."
> > > > > So, the idea is to 

[jira] [Created] (FLINK-33229) Moving Java FlinkRecomputeStatisticsProgram from scala package to java package

2023-10-10 Thread dalongliu (Jira)
dalongliu created FLINK-33229:
-

 Summary: Moving Java FlinkRecomputeStatisticsProgram from scala 
package to java package
 Key: FLINK-33229
 URL: https://issues.apache.org/jira/browse/FLINK-33229
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: dalongliu
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Release 1.18.0, release candidate #1

2023-10-10 Thread Jing Ge
Hi folks,

In the release managers' sync-up meeting today, we have discussed all
issues and came to a consensus that there are no blocking issues for 1.18
release. Since there were some new commits after rc1, I will create a rc2
now. Thanks again for your contributions.

Best regards,
Jing



On Mon, Oct 9, 2023 at 12:10 PM Sergey Nuyanzin  wrote:

> Hi Jing,
>
> Thanks for looking into this
>
> I checked it against opensearch connector [1] and it is green now
> Also checked against hbase [2] which is now failing because of a different
> issue
> however anyway it was able to go further (before it failed at Download
> Flink binary task) than when it was not publicly accessible.
> Also a had a look at kafka connector [3] where Download Flink binary task
> is also completed
> Thus i don't see a connector where Download Flink binary is failing, so i
> would consider this as resolved
>
> At the same time there is another issue which was mentioned above [4]. It
> looks like there were some changes in 1.18.0 like [5], [6] which
> potentially could result in this issue
> Shouldn't we consider it as a blocker?
>
> [1]
>
> https://github.com/apache/flink-connector-opensearch/actions/runs/6312803412?pr=32
> [2]
>
> https://github.com/apache/flink-connector-hbase/actions/runs/6210320267/job/17518738266
> [3]
>
> https://github.com/apache/flink-connector-kafka/actions/runs/6453974340/job/17518572643
> [4] https://issues.apache.org/jira/browse/FLINK-33186
> [5] https://issues.apache.org/jira/browse/FLINK-32996
> [6] https://issues.apache.org/jira/browse/FLINK-32907
>
>
> On Mon, Oct 9, 2023 at 10:06 AM Jing Ge 
> wrote:
>
> > Hi Folks,
> >
> > Ververica has made the files publicly accessible. Could anyone check if
> any
> > connector build works again?
> >
> > Best regards,
> > Jing
> >
> > On Mon, Oct 9, 2023 at 9:07 AM Jing Ge  wrote:
> >
> > > Hi Sergey and devs,
> > >
> > > Thanks for bringing this to our attention. I am open to discuss that. I
> > > have the following thoughts:
> > >
> > > 1. Like I already mentioned in many other threads, build issues in
> > > downstream repos should not block upstream release. I understand the
> > > concern that developers want to have stable connectors. But it violates
> > the
> > > intention of connector externalization.
> > > 2. It is expensive to download Flink jar(roughly 500M) from S3 for each
> > PR
> > > and nightly build of each connector. Does it make sense to leverage
> [1].
> > > Many Flink docs have been using it.
> > > 3. I will check internally at Ververica to see if we could make the
> file
> > > publicly accessible to temporarily solve this issue.
> > >
> > > Looking forward to your feedback.
> > >
> > > Best regards,
> > > Jing
> > >
> > > [1] https://nightlies.apache.org/flink/
> > >
> > > On Fri, Oct 6, 2023 at 2:03 PM Konstantin Knauf 
> > wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> I've just opened a PR for the release announcement [1] and I am
> looking
> > >> forward to reviews and feedback.
> > >>
> > >> Cheers,
> > >>
> > >> Konstantin
> > >>
> > >> [1] https://github.com/apache/flink-web/pull/680
> > >>
> > >> Am Fr., 6. Okt. 2023 um 11:03 Uhr schrieb Sergey Nuyanzin <
> > >> snuyan...@gmail.com>:
> > >>
> > >> > sorry for not mentioning it in previous mail
> > >> >
> > >> > based on the reason above I'm
> > >> > -1 (non-binding)
> > >> >
> > >> > also there is one more issue [1]
> > >> > which blocks all the externalised connectors testing against the
> most
> > >> > recent commits in
> > >> > to corresponding branches
> > >> > [1] https://issues.apache.org/jira/browse/FLINK-33175
> > >> >
> > >> >
> > >> > On Thu, Oct 5, 2023 at 11:19 PM Sergey Nuyanzin <
> snuyan...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Thanks for creating RC1
> > >> > >
> > >> > > * Downloaded artifacts
> > >> > > * Built from sources
> > >> > > * Verified checksums and gpg signatures
> > >> > > * Verified versions in pom files
> > >> > > * Checked NOTICE, LICENSE files
> > >> > >
> > >> > > The strange thing I faced is
> > >> > >
> > >>
> CheckpointAfterAllTasksFinishedITCase.testRestoreAfterSomeTasksFinished
> > >> > > fails on AZP [1]
> > >> > >
> > >> > > which looks like it is related to [2], [3] fixed  in 1.18.0 (not
> > 100%
> > >> > > sure).
> > >> > >
> > >> > >
> > >> > > [1] https://issues.apache.org/jira/browse/FLINK-33186
> > >> > > [2] https://issues.apache.org/jira/browse/FLINK-32996
> > >> > > [3] https://issues.apache.org/jira/browse/FLINK-32907
> > >> > >
> > >> > > On Tue, Oct 3, 2023 at 2:53 PM Ferenc Csaky
> > >> 
> > >> > > wrote:
> > >> > >
> > >> > >> Thanks everyone for the efforts!
> > >> > >>
> > >> > >> Checked the following:
> > >> > >>
> > >> > >> - Downloaded artifacts
> > >> > >> - Built Flink from source
> > >> > >> - Verified checksums/signatures
> > >> > >> - Verified NOTICE, LICENSE files
> > >> > >> - Deployed dummy SELECT job via SQL gateway on standalone
> cluster,
> > >> > things
> > >> > >> seemed fine according to the log files

Re: [DISCUSS] FLIP-375: Built-in cross-platform powerful java profiler on taskmanagers

2023-10-10 Thread Jing Ge
Hi Yun,

3% extra cost sounds good! It could be useful, if the reference could be
added into the FLIP. Thanks for answering my questions.

Best regards,
Jing

On Tue, Oct 10, 2023 at 9:15 AM Yun Tang  wrote:

> Hi Jing,
>
> First, developers would accept the little overhead when debugging the
> performance issues. Secondly, according to the async-profiler's report, it
> should only have less than 3% extra cost. I also verified it with a
> CPU-intensive ETL Flink job with this profiler, it does not show any
> obvious performance regression during profiling.
>
>
>
> [1] https://github.com/async-profiler/async-profiler/issues/14
>
> Best
> Yun Tang
>
> 
> From: Jing Ge 
> Sent: Tuesday, October 10, 2023 12:05
> To: dev@flink.apache.org 
> Subject: Re: [DISCUSS] FLIP-375: Built-in cross-platform powerful java
> profiler on taskmanagers
>
> Thanks Yun for your clarification. Especially thanks Rui for your
> informative elaboration. Since we will have two flame graphs, I would
> suggest updating Flink documentation to help users understand it and know
> when to use which one. The content provided by Rui is already a really good
> starting point. Like it!
>
> Yun
>
> Theoretically yeah, I am with you. Practically, it will help get more
> attraction and attention, if you can provide the real (test) metrics of how
> "extremely light" means, e.g. engineering mindset. Otherwise, each serious
> user will have to evaluate the performance on her/his own before using it
> in production. WDYT?
>
> Best regards,
> Jing
>
> On Tue, Oct 10, 2023 at 5:29 AM Yun Tang  wrote:
>
> > Hi Jing,
> >
> > I will answer current questions.
> >
> > > 1. will it replace the current flame graph, i.e. the current flame
> graph
> > will be deprecated and removed?
> >
> > Although I think the new java profiler introduced in FLIP-375 is more
> > powerful, just as Rui has replied, I don't think it could replace current
> > flame graph totally.
> >
> >
> > > 2.does it make sense to provide the performance difference between
> enable
> > and disable it?
> >
> > The new java profiler would not introduce any performance impact after we
> > enable it, it will only start work when we trigger the profiling. And
> from
> > our experiences, the overhead of profiling is extremely light.
> >
> >
> >
> > For Rui's question:
> >
> > > Are all process-level flamegraphs stored at BlobStore? Are they
> > maintained by JobManager after sampling? Is there cleanup strategy? Or
> > max-save-count strategy?
> >
> > Yes, we use blobstore to store the process-level flamegraph-files and
> > maintained on taskmanager side. They flamegraph-files will be cleanup
> > automatically once reached to rest.profiling.history-size.
> >
> > Best
> > Yun Tang
> >
> >
> >
> > 
> > From: Rui Fan <1996fan...@gmail.com>
> > Sent: Tuesday, October 10, 2023 10:10
> > To: dev@flink.apache.org 
> > Subject: Re: [DISCUSS] FLIP-375: Built-in cross-platform powerful java
> > profiler on taskmanagers
> >
> > Hi Jing,
> >
> > > 1. will it replace the current flame graph, i.e. the current flame
> graph
> > will be deprecated and removed?
> >
> > I think the current flame graph cannot be removed.
> >
> > As a core contributor to the current flame graph, and I use it almost
> > every week. I would like to clarify the difference between the current
> > flame graph and the flame graph proposed by FLIP-375.
> >
> > @The current flame graph
> >
> > The current flame graph is the operator level or task level, when one
> > operator is the bottleneck of current job. We can see the current
> > flamegraph to check what the operator is doing.
> >
> > It includes three types: On-CPU, Off-CPU and Mixed-Type. The Mixed-Type
> > is very useful, it can detect why operator is slow even if the operator
> > doesn't use CPU. For example, the operator is blocked on querying hbase.
> >
> > It just support the task thread, it means it cannot detect the cpu usage
> of
> > other threads, such as: RocksDB Flush or compaction. This's the
> > limitation of current flamegraph.
> >
> > @The flame graph proposed by FLIP-375.
> >
> > The flamegraph proposed by FLIP-375 works on process level, such as
> > JobManager or TaskManager, so it can monitor all threads. Such as:
> > rocksdb background threads.
> >
> > When the CPU usage of one TM is high, and all tasks are not busy.
> > The new flamegraph will be useful.
> >
> > Back to the question: It includes task or operator thread,
> > why the current flamegraph is still needed?
> >
> > 1. The flamegraph of process level cannot easily distinguish tasks.
> > Especially if there are multiple slots in a TM, and different subtasks of
> > the
> > same task running in multiple slots, their stacks are very similar.
> >
> > 2. The Mixed-Type of current flamegraph may not be replaced by the
> > process-level flame graph.
> >
> > Please correct me if anything is wrong, thanks~
> >
> > Hi Yu,
> >
> > > Jobmanager allows the 

Re: Support AWS SDK V2 for Flink's S3 FileSystem

2023-10-10 Thread Jing Ge
+1 for the s3 file consolidation. We already have many issues with internal
communication and talking to customers. Different file schemas are not very
user friendly, btw.

Best regards,
Jing


On Mon, Oct 9, 2023 at 6:49 PM Matthias Pohl 
wrote:

> I would agree with David's proposal as well.
>
> Would it make sense to come up with some performance comparisons for the
> different S3 implementations in the end? ...just to ensure that we're
> improving things or (at least) don't make things worse. Or is there
> something like that already somewhere?
>
> A bit out of scope:
> We noticed that the FileSystem contract is not well defined. The JavaDoc is
> ambiguous (IMHO) for some operations. For instance, the return value of
> delete [1] is true "if the operation was successful": It's unclear (at
> least to me) what success means here. Is it about the processing (i.e. the
> delete was performed on an existing file) or the outcome (i.e. success is
> reached as well if the file didn't exist in the first place). Removing the
> return type could help to make the contract clearer. In the end, only the
> outcome (i.e. the file doesn't exist anymore) matters in my opinion. A
> similar argument could be applied to mkdirs [2] and rename [3].
>
> That said, I'm not suggesting you adapt the interface as part of your work.
> But it would be good to collect other improvements as part of it. We could
> consider improving the FileSystem interface as part of the 2.0 efforts as a
> follow-up.
>
> [1]
>
> https://github.com/apache/flink/blob/d78d52b27af2550f50b44349d3ec6dc84b966a8a/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L695
> [2]
>
> https://github.com/apache/flink/blob/d78d52b27af2550f50b44349d3ec6dc84b966a8a/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L706
> [3]
>
> https://github.com/apache/flink/blob/d78d52b27af2550f50b44349d3ec6dc84b966a8a/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L773
>
> On Tue, Oct 3, 2023 at 6:25 PM Martijn Visser 
> wrote:
>
> > +1 for David's suggestion. We should get away from the current
> > approach with two abstractions and get to one rock solid one.
> >
> > On Mon, Oct 2, 2023 at 11:13 PM David Morávek  wrote:
> > >
> > > Hi Maomao,
> > >
> > > I wonder whether it would make sense to take a stab at consolidating
> the
> > S3
> > > filesystems instead and introduce a native one. The whole Hadoop
> wrapper
> > > around the S3 client exists for legacy reasons, and it adds complexity
> > and
> > > probably an unnecessary performance penalty.
> > >
> > > If you take a look at the underlying presto implementation, it's
> actually
> > > not too complex to adapt to Flink interfaces (since you're proposing to
> > > maintain a copy of it anyway).
> > >
> > > Overall, the S3 FS is probably the most used one that we have so this
> > could
> > > be rather high impact. It would also eliminate user confusion when
> > choosing
> > > the implementation to use.
> > >
> > > WDYT?
> > >
> > > Best,
> > > D.
> > >
> > > On Fri, Sep 29, 2023 at 2:41 PM Min, Maomao
>  > >
> > > wrote:
> > >
> > > > Hi Flink Dev,
> > > >
> > > > I’m Maomao, a developer from AWS EMR.
> > > >
> > > > Recently, our team is working on adding AWS SDK V2 support for
> Flink’s
> > S3
> > > > Filesystem. During development, we found out that our work was
> blocked
> > by
> > > > Presto. This is because that Presto still uses AWS SDK V1 and won’t
> add
> > > > support for AWS SDK V2 in short term. To unblock, our team proposed
> > several
> > > > options and I’ve created a JIRA issue as here<
> > > > https://issues.apache.org/jira/browse/FLINK-33157>.
> > > >
> > > > Since our team plans to contribute this work back to the community
> > later,
> > > > we’d like to collect feedback from the community about the options we
> > > > proposed in the long term so that the community won’t need to
> duplicate
> > > > this work in the future.
> > > >
> > > > Best,
> > > > Maomao
> > > >
> > > >
> >
>


[jira] [Created] (FLINK-33228) Fix the total current resource calculation in fulfill requirements

2023-10-10 Thread xiangyu feng (Jira)
xiangyu feng created FLINK-33228:


 Summary: Fix the total current resource calculation in fulfill 
requirements
 Key: FLINK-33228
 URL: https://issues.apache.org/jira/browse/FLINK-33228
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Reporter: xiangyu feng






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Support AWS SDK V2 for Flink's S3 FileSystem

2023-10-10 Thread Zhao, Kevin
Looks like Maomao was missed from previous replies. Adding back 
@Maomao.

Thanks everyone for your response. We are having some discussion within AWS EMR 
team. Will get back to you very soon.

Regards,
Kevin

From: Matthias Pohl 
Date: Tuesday, October 10, 2023 at 15:35
To: "dev@flink.apache.org" 
Cc: "Zhao, Kevin" , "Josephraj, Prabhu" 
, emr-flink-team 
Subject: RE: [EXTERNAL] Support AWS SDK V2 for Flink's S3 FileSystem


CAUTION: This email originated from outside of the organization. Do not click 
links or open attachments unless you can confirm the sender and know the 
content is safe.


Just to add a bit more context to the performance test question: What I had in 
mind was the exists call on a (non-existing) directories in a bucket with a lot 
of objects. A comment from one of the SDK contributors about that call was that 
it could be an expensive call in an object store if implemented wrongly. I 
would imagine that this could be a valid concern because the concept of 
directories is not really present in an object store like S3, if I'm not 
mistaken?!

On Mon, Oct 9, 2023 at 6:49 PM Matthias Pohl 
mailto:matthias.p...@aiven.io>> wrote:
I would agree with David's proposal as well.

Would it make sense to come up with some performance comparisons for the 
different S3 implementations in the end? ...just to ensure that we're improving 
things or (at least) don't make things worse. Or is there something like that 
already somewhere?

A bit out of scope:
We noticed that the FileSystem contract is not well defined. The JavaDoc is 
ambiguous (IMHO) for some operations. For instance, the return value of delete 
[1] is true "if the operation was successful": It's unclear (at least to me) 
what success means here. Is it about the processing (i.e. the delete was 
performed on an existing file) or the outcome (i.e. success is reached as well 
if the file didn't exist in the first place). Removing the return type could 
help to make the contract clearer. In the end, only the outcome (i.e. the file 
doesn't exist anymore) matters in my opinion. A similar argument could be 
applied to mkdirs [2] and rename [3].

That said, I'm not suggesting you adapt the interface as part of your work. But 
it would be good to collect other improvements as part of it. We could consider 
improving the FileSystem interface as part of the 2.0 efforts as a follow-up.

[1] 
https://github.com/apache/flink/blob/d78d52b27af2550f50b44349d3ec6dc84b966a8a/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L695
[2] 
https://github.com/apache/flink/blob/d78d52b27af2550f50b44349d3ec6dc84b966a8a/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L706
[3] 
https://github.com/apache/flink/blob/d78d52b27af2550f50b44349d3ec6dc84b966a8a/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L773

On Tue, Oct 3, 2023 at 6:25 PM Martijn Visser 
mailto:martijnvis...@apache.org>> wrote:
+1 for David's suggestion. We should get away from the current
approach with two abstractions and get to one rock solid one.

On Mon, Oct 2, 2023 at 11:13 PM David Morávek 
mailto:d...@apache.org>> wrote:
>
> Hi Maomao,
>
> I wonder whether it would make sense to take a stab at consolidating the S3
> filesystems instead and introduce a native one. The whole Hadoop wrapper
> around the S3 client exists for legacy reasons, and it adds complexity and
> probably an unnecessary performance penalty.
>
> If you take a look at the underlying presto implementation, it's actually
> not too complex to adapt to Flink interfaces (since you're proposing to
> maintain a copy of it anyway).
>
> Overall, the S3 FS is probably the most used one that we have so this could
> be rather high impact. It would also eliminate user confusion when choosing
> the implementation to use.
>
> WDYT?
>
> Best,
> D.
>
> On Fri, Sep 29, 2023 at 2:41 PM Min, Maomao 
> wrote:
>
> > Hi Flink Dev,
> >
> > I’m Maomao, a developer from AWS EMR.
> >
> > Recently, our team is working on adding AWS SDK V2 support for Flink’s S3
> > Filesystem. During development, we found out that our work was blocked by
> > Presto. This is because that Presto still uses AWS SDK V1 and won’t add
> > support for AWS SDK V2 in short term. To unblock, our team proposed several
> > options and I’ve created a JIRA issue as here<
> > https://issues.apache.org/jira/browse/FLINK-33157>.
> >
> > Since our team plans to contribute this work back to the community later,
> > we’d like to collect feedback from the community about the options we
> > proposed in the long term so that the community won’t need to duplicate
> > this work in the future.
> >
> > Best,
> > Maomao
> >
> >


Re: Support AWS SDK V2 for Flink's S3 FileSystem

2023-10-10 Thread Matthias Pohl
Just to add a bit more context to the performance test question: What I had
in mind was the exists call on a (non-existing) directories in a bucket
with a lot of objects. A comment from one of the SDK contributors about
that call was that it could be an expensive call in an object store if
implemented wrongly. I would imagine that this could be a valid concern
because the concept of directories is not really present in an object store
like S3, if I'm not mistaken?!

On Mon, Oct 9, 2023 at 6:49 PM Matthias Pohl  wrote:

> I would agree with David's proposal as well.
>
> Would it make sense to come up with some performance comparisons for the
> different S3 implementations in the end? ...just to ensure that we're
> improving things or (at least) don't make things worse. Or is there
> something like that already somewhere?
>
> A bit out of scope:
> We noticed that the FileSystem contract is not well defined. The JavaDoc
> is ambiguous (IMHO) for some operations. For instance, the return value of
> delete [1] is true "if the operation was successful": It's unclear (at
> least to me) what success means here. Is it about the processing (i.e. the
> delete was performed on an existing file) or the outcome (i.e. success is
> reached as well if the file didn't exist in the first place). Removing the
> return type could help to make the contract clearer. In the end, only the
> outcome (i.e. the file doesn't exist anymore) matters in my opinion. A
> similar argument could be applied to mkdirs [2] and rename [3].
>
> That said, I'm not suggesting you adapt the interface as part of your
> work. But it would be good to collect other improvements as part of it. We
> could consider improving the FileSystem interface as part of the 2.0
> efforts as a follow-up.
>
> [1]
> https://github.com/apache/flink/blob/d78d52b27af2550f50b44349d3ec6dc84b966a8a/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L695
> [2]
> https://github.com/apache/flink/blob/d78d52b27af2550f50b44349d3ec6dc84b966a8a/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L706
> [3]
> https://github.com/apache/flink/blob/d78d52b27af2550f50b44349d3ec6dc84b966a8a/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L773
>
> On Tue, Oct 3, 2023 at 6:25 PM Martijn Visser 
> wrote:
>
>> +1 for David's suggestion. We should get away from the current
>> approach with two abstractions and get to one rock solid one.
>>
>> On Mon, Oct 2, 2023 at 11:13 PM David Morávek  wrote:
>> >
>> > Hi Maomao,
>> >
>> > I wonder whether it would make sense to take a stab at consolidating
>> the S3
>> > filesystems instead and introduce a native one. The whole Hadoop wrapper
>> > around the S3 client exists for legacy reasons, and it adds complexity
>> and
>> > probably an unnecessary performance penalty.
>> >
>> > If you take a look at the underlying presto implementation, it's
>> actually
>> > not too complex to adapt to Flink interfaces (since you're proposing to
>> > maintain a copy of it anyway).
>> >
>> > Overall, the S3 FS is probably the most used one that we have so this
>> could
>> > be rather high impact. It would also eliminate user confusion when
>> choosing
>> > the implementation to use.
>> >
>> > WDYT?
>> >
>> > Best,
>> > D.
>> >
>> > On Fri, Sep 29, 2023 at 2:41 PM Min, Maomao > >
>> > wrote:
>> >
>> > > Hi Flink Dev,
>> > >
>> > > I’m Maomao, a developer from AWS EMR.
>> > >
>> > > Recently, our team is working on adding AWS SDK V2 support for
>> Flink’s S3
>> > > Filesystem. During development, we found out that our work was
>> blocked by
>> > > Presto. This is because that Presto still uses AWS SDK V1 and won’t
>> add
>> > > support for AWS SDK V2 in short term. To unblock, our team proposed
>> several
>> > > options and I’ve created a JIRA issue as here<
>> > > https://issues.apache.org/jira/browse/FLINK-33157>.
>> > >
>> > > Since our team plans to contribute this work back to the community
>> later,
>> > > we’d like to collect feedback from the community about the options we
>> > > proposed in the long term so that the community won’t need to
>> duplicate
>> > > this work in the future.
>> > >
>> > > Best,
>> > > Maomao
>> > >
>> > >
>>
>


Re: [DISCUSS] FLIP-375: Built-in cross-platform powerful java profiler on taskmanagers

2023-10-10 Thread Yun Tang
Hi Jing,

First, developers would accept the little overhead when debugging the 
performance issues. Secondly, according to the async-profiler's report, it 
should only have less than 3% extra cost. I also verified it with a 
CPU-intensive ETL Flink job with this profiler, it does not show any obvious 
performance regression during profiling.



[1] https://github.com/async-profiler/async-profiler/issues/14

Best
Yun Tang


From: Jing Ge 
Sent: Tuesday, October 10, 2023 12:05
To: dev@flink.apache.org 
Subject: Re: [DISCUSS] FLIP-375: Built-in cross-platform powerful java profiler 
on taskmanagers

Thanks Yun for your clarification. Especially thanks Rui for your
informative elaboration. Since we will have two flame graphs, I would
suggest updating Flink documentation to help users understand it and know
when to use which one. The content provided by Rui is already a really good
starting point. Like it!

Yun

Theoretically yeah, I am with you. Practically, it will help get more
attraction and attention, if you can provide the real (test) metrics of how
"extremely light" means, e.g. engineering mindset. Otherwise, each serious
user will have to evaluate the performance on her/his own before using it
in production. WDYT?

Best regards,
Jing

On Tue, Oct 10, 2023 at 5:29 AM Yun Tang  wrote:

> Hi Jing,
>
> I will answer current questions.
>
> > 1. will it replace the current flame graph, i.e. the current flame graph
> will be deprecated and removed?
>
> Although I think the new java profiler introduced in FLIP-375 is more
> powerful, just as Rui has replied, I don't think it could replace current
> flame graph totally.
>
>
> > 2.does it make sense to provide the performance difference between enable
> and disable it?
>
> The new java profiler would not introduce any performance impact after we
> enable it, it will only start work when we trigger the profiling. And from
> our experiences, the overhead of profiling is extremely light.
>
>
>
> For Rui's question:
>
> > Are all process-level flamegraphs stored at BlobStore? Are they
> maintained by JobManager after sampling? Is there cleanup strategy? Or
> max-save-count strategy?
>
> Yes, we use blobstore to store the process-level flamegraph-files and
> maintained on taskmanager side. They flamegraph-files will be cleanup
> automatically once reached to rest.profiling.history-size.
>
> Best
> Yun Tang
>
>
>
> 
> From: Rui Fan <1996fan...@gmail.com>
> Sent: Tuesday, October 10, 2023 10:10
> To: dev@flink.apache.org 
> Subject: Re: [DISCUSS] FLIP-375: Built-in cross-platform powerful java
> profiler on taskmanagers
>
> Hi Jing,
>
> > 1. will it replace the current flame graph, i.e. the current flame graph
> will be deprecated and removed?
>
> I think the current flame graph cannot be removed.
>
> As a core contributor to the current flame graph, and I use it almost
> every week. I would like to clarify the difference between the current
> flame graph and the flame graph proposed by FLIP-375.
>
> @The current flame graph
>
> The current flame graph is the operator level or task level, when one
> operator is the bottleneck of current job. We can see the current
> flamegraph to check what the operator is doing.
>
> It includes three types: On-CPU, Off-CPU and Mixed-Type. The Mixed-Type
> is very useful, it can detect why operator is slow even if the operator
> doesn't use CPU. For example, the operator is blocked on querying hbase.
>
> It just support the task thread, it means it cannot detect the cpu usage of
> other threads, such as: RocksDB Flush or compaction. This's the
> limitation of current flamegraph.
>
> @The flame graph proposed by FLIP-375.
>
> The flamegraph proposed by FLIP-375 works on process level, such as
> JobManager or TaskManager, so it can monitor all threads. Such as:
> rocksdb background threads.
>
> When the CPU usage of one TM is high, and all tasks are not busy.
> The new flamegraph will be useful.
>
> Back to the question: It includes task or operator thread,
> why the current flamegraph is still needed?
>
> 1. The flamegraph of process level cannot easily distinguish tasks.
> Especially if there are multiple slots in a TM, and different subtasks of
> the
> same task running in multiple slots, their stacks are very similar.
>
> 2. The Mixed-Type of current flamegraph may not be replaced by the
> process-level flame graph.
>
> Please correct me if anything is wrong, thanks~
>
> Hi Yu,
>
> > Jobmanager allows the user to download the results of the corresponding
> files on taskmanager with the blob service.
>
> Are all process-level flamegraphs stored at BlobStore?
> Are they maintained by JobManager after sampling?
> Is there cleanup strategy? Or max-save-count strategy?
>
> Best,
> Rui
>
>
> On Tue, Oct 10, 2023 at 1:24 AM Jing Ge 
> wrote:
>
> > Hi Yu, Hi Yun,
> >
> > Brilliant idea! People are keen to use it. Thanks for your proposal! I
> was
> > wondering:
> >
> > 1. will