[jira] [Closed] (FLINK-33863) Compressed Operator state restore failed

2023-12-26 Thread Yanfei Lei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanfei Lei closed FLINK-33863.
--

> Compressed Operator state restore failed
> 
>
> Key: FLINK-33863
> URL: https://issues.apache.org/jira/browse/FLINK-33863
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Ruibin Xing
>Assignee: Ruibin Xing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> We encountered an issue when using Flink 1.18.0. Our job enabled Snapshot 
> Compression and used multiple operator states and broadcast states in an 
> operator. When recovering Operator State from a Savepoint, the following 
> error occurred: "org.xerial.snappy.SnappyFramedInputStream: encountered EOF 
> while reading stream header."
> After researching, I believe the error is due to Flink 1.18.0's support for 
> Snapshot Compression on Operator State (see 
> https://issues.apache.org/jira/browse/FLINK-30113 ). When writing a 
> Savepoint, SnappyFramedOutputStream adds a header to the beginning of the 
> data; when recovering Operator State from a Savepoint, 
> SnappyFramedInputStream verifies the header at the beginning of the data.
> Currently, when recovering Operator State with Snapshot Compression enabled, 
> the logic is as follows:
> For each OperatorStateHandle:
> 1. Verify if the current Savepoint stream's offset is the Snappy header.
> 2. Seek to the state's start offset.
> 3. Read the state's data and finally seek to the state's end offset.
> (See: 
> [https://github.com/apache/flink/blob/ef2b626d67147797e992ec3b338bafdb4e5ab1c7/flink-runtime/src/main/java/org/apache/flink/runtime/state/OperatorStateRestoreOperation.java#L172]
>  )
> Furthermore, when there are multiple Operator States, they are not sorted 
> by the Operator State's offset. The broadcast states will always be 
> written at the end of the savepoint. However, when reading from the savepoint, 
> there is no guarantee that the broadcast states will be read last.
> Therefore, if the Operator States are read out of order and the state at the 
> final offset is recovered first, the Savepoint stream will be seeked to the 
> end, resulting in an EOF error.
> I propose a solution: sort the OperatorStateHandles by offset and then recover 
> the Operator States in order. After testing, this approach resolves the issue.
> I will submit a PR. This is my first time contributing code, so any help is 
> really appreciated.
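The proposed fix can be sketched as follows. This is a minimal illustration with simplified, hypothetical types (`StateMeta`, `restoreOrder`), not Flink's actual OperatorStateHandle API:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortedRestoreSketch {
    // Simplified stand-in for an OperatorStateHandle's (name, start offset) metadata.
    record StateMeta(String name, long startOffset) {}

    // Restore states in ascending offset order, so the stream is only ever
    // seeked forward and the compressed stream's header check sees the data
    // it expects.
    static List<String> restoreOrder(List<StateMeta> handles) {
        List<StateMeta> sorted = new ArrayList<>(handles);
        sorted.sort(Comparator.comparingLong(StateMeta::startOffset));
        List<String> order = new ArrayList<>();
        for (StateMeta m : sorted) {
            order.add(m.name); // the real code would seek to m.startOffset and read here
        }
        return order;
    }

    public static void main(String[] args) {
        // Broadcast state is written last, but may be handed to us first: without
        // sorting, restoring "broadcast" first would seek the stream to its end.
        List<StateMeta> handles = List.of(
                new StateMeta("broadcast", 300),
                new StateMeta("list-state-a", 0),
                new StateMeta("list-state-b", 150));
        System.out.println(restoreOrder(handles)); // prints [list-state-a, list-state-b, broadcast]
    }
}
```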



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33863) Compressed Operator state restore failed

2023-12-26 Thread Yanfei Lei (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800702#comment-17800702
 ] 

Yanfei Lei commented on FLINK-33863:


Merged via 
[d415d93bbf9620ba985136469107edd8c6e31cc6|https://github.com/apache/flink/commit/d415d93bbf9620ba985136469107edd8c6e31cc6]
 

> Compressed Operator state restore failed
> 
>
> Key: FLINK-33863
> URL: https://issues.apache.org/jira/browse/FLINK-33863
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Ruibin Xing
>Assignee: Ruibin Xing
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Resolved] (FLINK-33863) Compressed Operator state restore failed

2023-12-26 Thread Yanfei Lei (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanfei Lei resolved FLINK-33863.

Fix Version/s: 1.19.0
   Resolution: Fixed

> Compressed Operator state restore failed
> 
>
> Key: FLINK-33863
> URL: https://issues.apache.org/jira/browse/FLINK-33863
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Ruibin Xing
>Assignee: Ruibin Xing
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>





[jira] [Assigned] (FLINK-33938) Correct implicit coercions in relational operators to adopt typescript 5.0

2023-12-26 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song reassigned FLINK-33938:


Assignee: Ao Yuchen

> Correct implicit coercions in relational operators to adopt typescript 5.0
> --
>
> Key: FLINK-33938
> URL: https://issues.apache.org/jira/browse/FLINK-33938
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Ao Yuchen
>Assignee: Ao Yuchen
>Priority: Major
> Fix For: 1.19.0, 1.17.3, 1.18.2
>
>
> Since TypeScript 5.0, there is a breaking change: implicit coercions in 
> relational operators are forbidden [1].
> As a result, the following code in 
> flink-runtime-web/web-dashboard/src/app/components/humanize-date.pipe.ts gets 
> an error:
> {code:java}
> public transform(
>   value: number | string | Date,
>   ...
> ): string | null | undefined {
>   if (value == null || value === '' || value !== value || value < 0) {
> return '-';
>   } 
>   ...
> }{code}
> The correctness improvement is available here [2].
> I think we should optimize this type of code for better compatibility.
>  
> [1] 
> [https://devblogs.microsoft.com/typescript/announcing-typescript-5-0/#forbidden-implicit-coercions-in-relational-operators]
> [2] 
> [https://github.com/microsoft/TypeScript/pull/52048]
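One possible shape of the fix, as a sketch only (the actual patch may differ): replace the implicit relational coercion with an explicit numeric conversion before comparing.

```typescript
// TS 5.0 forbids `value < 0` when `value` may be a string or Date
// (implicit coercion in a relational operator). A sketch of a fix:
// convert explicitly, then compare numbers with numbers.
function transform(value: number | string | Date | null): string {
  if (value == null || value === '') {
    return '-';
  }
  const n = Number(value); // explicit coercion: Date -> epoch ms, string -> number
  if (Number.isNaN(n) || n < 0) { // also covers the old `value !== value` NaN check
    return '-';
  }
  return String(n);
}

console.log(transform(-5));   // '-'
console.log(transform(1000)); // '1000'
```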





Re: [PR] [FLINK-33884] Update Pulsar dependency to 3.0.2 in Pulsar Connector [flink-connector-pulsar]

2023-12-26 Thread via GitHub


syhily commented on PR #72:
URL: 
https://github.com/apache/flink-connector-pulsar/pull/72#issuecomment-1870006920

   The license file version should also be bumped in the meantime.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-33009) tools/release/update_japicmp_configuration.sh should only enable binary compatibility checks in the release branch

2023-12-26 Thread Wencong Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800696#comment-17800696
 ] 

Wencong Liu edited comment on FLINK-33009 at 12/27/23 6:43 AM:
---

Hi [~mapohl] , I've encountered the same issue once more in FLINK-33949 while 
making some code changes that japicmp considers binary incompatible. I'd 
like to take this ticket and fix it. WDYT?


was (Author: JIRAUSER281639):
Hi [~mapohl] , I've encountered the same issue once more in [FLINK-33949] 
METHOD_ABSTRACT_NOW_DEFAULT should be both source compatible and binary 
compatible - ASF JIRA (apache.org)when I'm making some code changes considered 
binary incompatible by japicmp. I'd like to take this ticket and fix it. WDYT?

> tools/release/update_japicmp_configuration.sh should only enable binary 
> compatibility checks in the release branch
> --
>
> Key: FLINK-33009
> URL: https://issues.apache.org/jira/browse/FLINK-33009
> Project: Flink
>  Issue Type: Bug
>  Components: Release System
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Priority: Major
>
> According to Flink's API compatibility constraints, we only support binary 
> compatibility between versions. In 
> [apache-flink:pom.xml:2246|https://github.com/apache/flink/blob/aa8d93ea239f5be79066b7e5caad08d966c86ab2/pom.xml#L2246]
>  we have binary compatibility enabled even in {{master}}. This doesn't comply 
> with the rules. We should have this flag disabled in {{master}}. The 
> {{tools/release/update_japicmp_configuration.sh}} script should enable this 
> flag in the release branch as part of the release process.





[jira] [Comment Edited] (FLINK-33009) tools/release/update_japicmp_configuration.sh should only enable binary compatibility checks in the release branch

2023-12-26 Thread Wencong Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800696#comment-17800696
 ] 

Wencong Liu edited comment on FLINK-33009 at 12/27/23 6:42 AM:
---

Hi [~mapohl] , I've encountered the same issue once more in [FLINK-33949] 
METHOD_ABSTRACT_NOW_DEFAULT should be both source compatible and binary 
compatible - ASF JIRA (apache.org) when I'm making some code changes considered 
binary incompatible by japicmp. I'd like to take this ticket and fix it. WDYT?


was (Author: JIRAUSER281639):
Hi [~mapohl] , I've encountered the same issue once more when I'm making some 
code changes considered binary incompatible by japicmp. I'd like to take this 
ticket and fix it. WDYT?

> tools/release/update_japicmp_configuration.sh should only enable binary 
> compatibility checks in the release branch
> --
>
> Key: FLINK-33009
> URL: https://issues.apache.org/jira/browse/FLINK-33009
> Project: Flink
>  Issue Type: Bug
>  Components: Release System
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Priority: Major
>





Re: [PR] [FLINK-33681] Reuse input/output metrics of SourceOperator/SinkWriterOperator for task [flink]

2023-12-26 Thread via GitHub


flinkbot commented on PR #23998:
URL: https://github.com/apache/flink/pull/23998#issuecomment-1869997869

   
   ## CI report:
   
   * a99046290254b2056258bd8023fc3c89edff3735 UNKNOWN
   
   
   Bot commands: the @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Closed] (FLINK-33682) Reuse source operator input records/bytes metrics for SourceOperatorStreamTask

2023-12-26 Thread Zhanghao Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhanghao Chen closed FLINK-33682.
-
Resolution: Abandoned

> Reuse source operator input records/bytes metrics for SourceOperatorStreamTask
> --
>
> Key: FLINK-33682
> URL: https://issues.apache.org/jira/browse/FLINK-33682
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Metrics
>Reporter: Zhanghao Chen
>Assignee: Zhanghao Chen
>Priority: Major
>  Labels: pull-request-available
>
> For SourceOperatorStreamTask, the source operator is the head operator that 
> takes input. We can directly reuse the source operator's input records/bytes 
> metrics for it.





[jira] [Commented] (FLINK-33682) Reuse source operator input records/bytes metrics for SourceOperatorStreamTask

2023-12-26 Thread Zhanghao Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800697#comment-17800697
 ] 

Zhanghao Chen commented on FLINK-33682:
---

Closing this ticket as we found a better unified implementation for it. See 
[FLINK-33681|https://issues.apache.org/jira/browse/FLINK-33681]

> Reuse source operator input records/bytes metrics for SourceOperatorStreamTask
> --
>
> Key: FLINK-33682
> URL: https://issues.apache.org/jira/browse/FLINK-33682
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Metrics
>Reporter: Zhanghao Chen
>Assignee: Zhanghao Chen
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Updated] (FLINK-33681) Display source/sink numRecordsIn/Out & numBytesIn/Out on UI

2023-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33681:
---
Labels: pull-request-available  (was: )

> Display source/sink numRecordsIn/Out & numBytesIn/Out on UI
> ---
>
> Key: FLINK-33681
> URL: https://issues.apache.org/jira/browse/FLINK-33681
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Zhanghao Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: screenshot-20231224-195605-1.png, 
> screenshot-20231225-120421.png
>
>
> Currently, the numRecordsIn & numBytesIn metrics for sources and the 
> numRecordsOut & numBytesOut metrics for sinks are always 0 on the Flink web 
> dashboard.
> FLINK-11576 brings us these metrics at the operator level, but it does not 
> integrate them at the task level. On the other hand, the summary metrics on 
> the job overview page are based on the task-level I/O metrics. As a result, 
> even though new connectors supporting FLIP-33 metrics report 
> operator-level I/O metrics, we still cannot see those metrics on the dashboard.
> This ticket serves as an umbrella issue to integrate standard source/sink I/O 
> metrics with the corresponding task I/O metrics. 
> !screenshot-20231224-195605-1.png|width=608,height=333!





[PR] [FLINK-33681] Reuse input/output metrics of SourceOperator/SinkWriterOperator for task [flink]

2023-12-26 Thread via GitHub


X-czh opened a new pull request, #23998:
URL: https://github.com/apache/flink/pull/23998

   
   
   ## What is the purpose of the change
   
   Currently, the numRecordsIn & numBytesIn metrics for sources and the 
numRecordsOut & numBytesOut metrics for sinks are always 0 on the Flink web 
dashboard.
   
   [FLINK-11576](https://issues.apache.org/jira/browse/FLINK-11576) brings us 
these metrics at the operator level, but it does not integrate them at the task 
level. On the other hand, the summary metrics on the job overview page are based 
on the task-level I/O metrics. As a result, even though new connectors 
supporting FLIP-33 metrics report operator-level I/O metrics, we still 
cannot see those metrics on the dashboard.
   
   This PR attempts to reuse the operator-level source/sink I/O metrics for the 
task so that they can be viewed on the Flink web dashboard. 
   
   ## Brief change log
   
   Reuse input/output metrics of SourceOperator/SinkWriterOperator for task. 
   
   Since Flink previously only accounted for internal traffic in the 
input/output bytes metrics, the reuse won't cause duplication in the I/O bytes 
metrics. Also, as the output records metric is intentionally dropped for 
SinkWriterOperator in `OperatorChain#getOperatorRecordsOutCounter` and no input 
records metric is collected for `SourceOperatorStreamTask`, no duplication in 
the I/O records metrics will take place.
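The reuse described above can be sketched with a toy counter (hypothetical; Flink's real metric classes differ): the operator-level metric and the task-level view share one underlying counter instance, so nothing is double-counted.

```java
import java.util.concurrent.atomic.AtomicLong;

public class MetricReuseSketch {
    // Toy stand-in for a Flink-style Counter metric.
    static class Counter {
        private final AtomicLong count = new AtomicLong();
        void inc(long n) { count.addAndGet(n); }
        long getCount() { return count.get(); }
    }

    public static void main(String[] args) {
        // One counter registered under both the operator-level and the
        // task-level metric name: updates made by the source operator are
        // visible through the task-level view without double counting.
        Counter numRecordsIn = new Counter();
        Counter operatorView = numRecordsIn; // operator metric group
        Counter taskView = numRecordsIn;     // task metric group reuses it

        operatorView.inc(3); // source operator emits 3 records
        System.out.println(taskView.getCount()); // prints 3
    }
}
```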
   
   ## Verifying this change
   
   Manually ran a Kafka-to-Kafka job on a testing cluster on K8s and verified 
that the source/sink input/output metrics can be seen on the web dashboard.
   
   
![image](https://github.com/apache/flink/assets/22020529/4ed94730-cf21-4451-b662-de38faef87e2)
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   





[jira] [Commented] (FLINK-33009) tools/release/update_japicmp_configuration.sh should only enable binary compatibility checks in the release branch

2023-12-26 Thread Wencong Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800696#comment-17800696
 ] 

Wencong Liu commented on FLINK-33009:
-

Hi [~mapohl] , I've encountered the same issue once more while making some 
code changes that japicmp considers binary incompatible. I'd like to take this 
ticket and fix it. WDYT?

> tools/release/update_japicmp_configuration.sh should only enable binary 
> compatibility checks in the release branch
> --
>
> Key: FLINK-33009
> URL: https://issues.apache.org/jira/browse/FLINK-33009
> Project: Flink
>  Issue Type: Bug
>  Components: Release System
>Affects Versions: 1.19.0
>Reporter: Matthias Pohl
>Priority: Major
>





[jira] [Updated] (FLINK-33949) METHOD_ABSTRACT_NOW_DEFAULT should be both source compatible and binary compatible

2023-12-26 Thread Wencong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wencong Liu updated FLINK-33949:

Description: 
Currently  I'm trying to refactor some APIs annotated by @Public in [FLIP-382: 
Unify the Provision of Diverse Metadata for Context-like APIs - Apache Flink - 
Apache Software 
Foundation|https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs].
 When an abstract method is changed into a default method, the japicmp maven 
plugin names this change METHOD_ABSTRACT_NOW_DEFAULT and considers it as source 
incompatible and binary incompatible.

The reason maybe that if the abstract method becomes default, the logic in the 
default method will be ignored by the previous implementations.

I create a test case in which a job is compiled with newly changed default 
method and submitted to the previous version. There is no exception thrown. 
Therefore, the METHOD_ABSTRACT_NOW_DEFAULT shouldn't be incompatible both for 
source and binary. We could add the following settings to override the default 
values for binary and source compatibility, such as:
{code:java}


   METHOD_ABSTRACT_NOW_DEFAULT
   true
   true

 {code}
By the way, currently the master branch checks both source compatibility and 
binary compatibility between minor versions. According to Flink's API 
compatibility constraints, the master branch shouldn't check binary 
compatibility. There is already jira FLINK-33009 to track it and we should fix 
it as soon as possible.
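A minimal, self-contained illustration of the change in question (hypothetical names, not the actual FLIP-382 interfaces): an implementation compiled against the abstract version keeps working when the method becomes default, because interface dispatch still reaches its override.

```java
public class AbstractNowDefaultSketch {
    // Old shape: `long currentTimestamp();` was abstract.
    // New shape: it gained a default body. Existing implementors that
    // override it are unaffected; invokeinterface still dispatches to them.
    interface Context {
        default long currentTimestamp() {
            return System.currentTimeMillis();
        }
    }

    // An "old" implementation, written when the method was still abstract.
    static class FixedContext implements Context {
        @Override
        public long currentTimestamp() {
            return 42L;
        }
    }

    public static void main(String[] args) {
        Context ctx = new FixedContext();
        // Dispatches to the override, not the new default body.
        System.out.println(ctx.currentTimestamp()); // prints 42
    }
}
```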

 

 

 

  was:
Currently  I'm trying to refactor some APIs annotated by @Public in [FLIP-382: 
Unify the Provision of Diverse Metadata for Context-like APIs - Apache Flink - 
Apache Software 
Foundation|https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs].
 When an abstract method is changed into a default method, the japicmp maven 
plugin names this change METHOD_ABSTRACT_NOW_DEFAULT and considers it as source 
incompatible and binary incompatible.

The reason maybe that if the abstract method becomes default, the logic in the 
default method will be ignored by the previous implementations.

I create a test case in which a job is compiled with newly changed default 
method and submitted to the previous version. There is no exception thrown. 
Therefore, the METHOD_ABSTRACT_NOW_DEFAULT shouldn't be incompatible both for 
source and binary.

By the way, currently the master branch checks both source compatibility and 
binary compatibility between minor versions. According to Flink's API 
compatibility constraints, the master branch shouldn't check binary 
compatibility. There is already jira FLINK-33009 to track it and we should fix 
it as soon as possible.

 

 

 


> METHOD_ABSTRACT_NOW_DEFAULT should be both source compatible and binary 
> compatible
> --
>
> Key: FLINK-33949
> URL: https://issues.apache.org/jira/browse/FLINK-33949
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Affects Versions: 1.19.0
>Reporter: Wencong Liu
>Priority: Major
> Fix For: 1.19.0
>
>





[jira] [Updated] (FLINK-33949) METHOD_ABSTRACT_NOW_DEFAULT should be both source compatible and binary compatible

2023-12-26 Thread Wencong Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wencong Liu updated FLINK-33949:

Description: 
Currently  I'm trying to refactor some APIs annotated by @Public in [FLIP-382: 
Unify the Provision of Diverse Metadata for Context-like APIs - Apache Flink - 
Apache Software 
Foundation|https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs].
 When an abstract method is changed into a default method, the japicmp maven 
plugin names this change METHOD_ABSTRACT_NOW_DEFAULT and considers it as source 
incompatible and binary incompatible.

The reason maybe that if the abstract method becomes default, the logic in the 
default method will be ignored by the previous implementations.

I create a test case in which a job is compiled with newly changed default 
method and submitted to the previous version. There is no exception thrown. 
Therefore, the METHOD_ABSTRACT_NOW_DEFAULT shouldn't be incompatible both for 
source and binary.

By the way, currently the master branch checks both source compatibility and 
binary compatibility between minor versions. According to Flink's API 
compatibility constraints, the master branch shouldn't check binary 
compatibility. There is already jira FLINK-33009 to track it and we should fix 
it as soon as possible.

 

 

 

  was:
Currently  I'm trying to refactor some APIs annotated by @Public in [FLIP-382: 
Unify the Provision of Diverse Metadata for Context-like APIs - Apache Flink - 
Apache Software 
Foundation|https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs].
 When an abstract method is changed into a default method, the japicmp maven 
plugin names this change METHOD_ABSTRACT_NOW_DEFAULT and considers it as source 
incompatible and binary incompatible.

The reason maybe that if the abstract method becomes default, the logic in the 
default method will be ignored by the previous implementations.

I create a test case in which a job is compiled with newly changed default 
method and submitted to the previous version. There is no exception thrown. 
Therefore, the METHOD_ABSTRACT_NOW_DEFAULT shouldn't be incompatible both for 
source and binary.

By the way, currently the master branch checks both source compatibility and 
binary compatibility between minor versions. According to Flink's API 
compatibility constraints, the master branch shouldn't check binary 
compatibility. There is already a [Jira|[FLINK-33009] 
tools/release/update_japicmp_configuration.sh should only enable binary 
compatibility checks in the release branch - ASF JIRA (apache.org)] to track it 
and we should fix it as soon as possible.

 

 

 


> METHOD_ABSTRACT_NOW_DEFAULT should be both source compatible and binary 
> compatible
> --
>
> Key: FLINK-33949
> URL: https://issues.apache.org/jira/browse/FLINK-33949
> Project: Flink
>  Issue Type: Bug
>  Components: Test Infrastructure
>Affects Versions: 1.19.0
>Reporter: Wencong Liu
>Priority: Major
> Fix For: 1.19.0
>
>
> Currently  I'm trying to refactor some APIs annotated by @Public in 
> [FLIP-382: Unify the Provision of Diverse Metadata for Context-like APIs - 
> Apache Flink - Apache Software 
> Foundation|https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs].
>  When an abstract method is changed into a default method, the japicmp maven 
> plugin names this change METHOD_ABSTRACT_NOW_DEFAULT and considers it as 
> source incompatible and binary incompatible.
> The reason may be that once the abstract method becomes default, the logic in 
> the default method is ignored by previous implementations that already override it.
> I created a test case in which a job is compiled against the newly changed 
> default method and submitted to the previous version. No exception is thrown. 
> Therefore, METHOD_ABSTRACT_NOW_DEFAULT shouldn't be treated as either source 
> incompatible or binary incompatible.
> By the way, the master branch currently checks both source compatibility and 
> binary compatibility between minor versions. According to Flink's API 
> compatibility constraints, the master branch shouldn't check binary 
> compatibility. There is already Jira FLINK-33009 to track this, and we should 
> fix it as soon as possible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33949) METHOD_ABSTRACT_NOW_DEFAULT should be both source compatible and binary compatible

2023-12-26 Thread Wencong Liu (Jira)
Wencong Liu created FLINK-33949:
---

 Summary: METHOD_ABSTRACT_NOW_DEFAULT should be both source 
compatible and binary compatible
 Key: FLINK-33949
 URL: https://issues.apache.org/jira/browse/FLINK-33949
 Project: Flink
  Issue Type: Bug
  Components: Test Infrastructure
Affects Versions: 1.19.0
Reporter: Wencong Liu
 Fix For: 1.19.0


Currently I'm trying to refactor some APIs annotated with @Public in [FLIP-382: 
Unify the Provision of Diverse Metadata for Context-like APIs - Apache Flink - 
Apache Software 
Foundation|https://cwiki.apache.org/confluence/display/FLINK/FLIP-382%3A+Unify+the+Provision+of+Diverse+Metadata+for+Context-like+APIs].
 When an abstract method is changed into a default method, the japicmp maven 
plugin names this change METHOD_ABSTRACT_NOW_DEFAULT and considers it both 
source incompatible and binary incompatible.

The reason may be that once the abstract method becomes default, the logic in 
the default method is ignored by previous implementations that already override it.

I created a test case in which a job is compiled against the newly changed 
default method and submitted to a cluster running the previous version. No 
exception is thrown. Therefore, METHOD_ABSTRACT_NOW_DEFAULT shouldn't be 
treated as either source incompatible or binary incompatible.

By the way, the master branch currently checks both source compatibility and 
binary compatibility between minor versions. According to Flink's API 
compatibility constraints, the master branch shouldn't check binary 
compatibility. There is already a Jira issue, FLINK-33009 
(tools/release/update_japicmp_configuration.sh should only enable binary 
compatibility checks in the release branch), to track this, and we should fix 
it as soon as possible.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800694#comment-17800694
 ] 

Vijay commented on FLINK-33943:
---

Thanks [~wanglijie] for your inputs.

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink 1.18 and Zookeeper 3.7.2 for our Flink cluster. 
> The job manager has been deployed in "Application mode", and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only one 
> job getting executed on the Flink dashboard. Following are the parameters set 
> for the Zookeeper-based HA setup in flink-conf.yaml. Please let us know if 
> anyone has experienced similar issues and has any suggestions. Thanks in 
> advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33944) Apache Flink: Process to restore more than one job on job manager startup from the respective savepoints

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800692#comment-17800692
 ] 

Vijay commented on FLINK-33944:
---

[~wanglijie] Do you have any input on this request about the savepoint-restore 
process for multiple jobs, either via the Java client or on JobManager startup 
(via standalone-job.sh or jobmanager.sh)? "standalone-job.sh" supports 
restoring only one job from a savepoint on JobManager startup.

> Apache Flink: Process to restore more than one job on job manager startup 
> from the respective savepoints
> 
>
> Key: FLINK-33944
> URL: https://issues.apache.org/jira/browse/FLINK-33944
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.18.0
>Reporter: Vijay
>Priority: Major
>
>  
> We are using Flink (1.18) version for our Flink cluster. The job manager has 
> been deployed in "Application mode" and we are looking for a process to 
> restore multiple jobs (using their respective savepoint directories) when the 
> job manager is started. Currently, we have the option to restore only one job 
> while running "standalone-job.sh" using the --fromSavepoint and 
> --allowNonRestoredState. However, we need a way to trigger multiple job 
> executions via Java client (from its respective savepoint location) on 
> Jobmanager startup.
> Note: We are not using a Kubernetes native deployment, but we are using k8s 
> standalone mode of deployment.
> Additional Query: If there is a process to restore multiple jobs from its 
> respective savepoints on "Application mode" of deployment, is the same 
> supported on Session mode of deployment or not?
> *Expected process:*
>  # Before starting with the Flink/application image upgrade, trigger the 
> savepoints for all the current running jobs.
>  # Once the savepoints process completed for all jobs, will trigger the scale 
> down of job manager and task manager instances.
>  # Update the image version on the k8s deployment with the update application 
> image.
>  # After image version is updated, scale up the job manager and task manager.
>  # We need a process to restore the previously running jobs from the 
> savepoint dir and start all the jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33945][table] Cleanup usage of deprecated org.apache.flink.tab… [flink]

2023-12-26 Thread via GitHub


flinkbot commented on PR #23997:
URL: https://github.com/apache/flink/pull/23997#issuecomment-1869927270

   
   ## CI report:
   
   * 9c4b58a6a3e15d468b983b97f27f8266bcfb3036 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-30535] Introduce TTL state based benchmarks [flink-benchmarks]

2023-12-26 Thread via GitHub


Zakelly commented on PR #83:
URL: https://github.com/apache/flink-benchmarks/pull/83#issuecomment-1869926898

   @Myasuka I fixed a compile error and now it's OK. Maybe this repo needs a CI 
process validating the compilation.
   Please let me know if you have any other concerns about this PR. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33948][table] Cleanup usage of deprecated org.apache.flink.tab… [flink]

2023-12-26 Thread via GitHub


flinkbot commented on PR #23996:
URL: https://github.com/apache/flink/pull/23996#issuecomment-1869925513

   
   ## CI report:
   
   * 8b708faea30bb078f8114bc3faf9e46e7b6ae775 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33945) Cleanup usage of deprecated org.apache.flink.table.api.dataview.ListView#ListView

2023-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33945:
---
Labels: pull-request-available  (was: )

> Cleanup usage of deprecated 
> org.apache.flink.table.api.dataview.ListView#ListView
> -
>
> Key: FLINK-33945
> URL: https://issues.apache.org/jira/browse/FLINK-33945
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: Jacky Lau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-33945][table] Cleanup usage of deprecated org.apache.flink.tab… [flink]

2023-12-26 Thread via GitHub


liuyongvs opened a new pull request, #23997:
URL: https://github.com/apache/flink/pull/23997

   
   
   ## What is the purpose of the change
   
   *Cleanup usage of deprecated 
org.apache.flink.table.api.dataview.ListView#ListView*
   
   
   
   ## Brief change log
   
   * Cleanup usage of deprecated 
org.apache.flink.table.api.dataview.ListView#ListView*
   
   
   ## Verifying this change
   * Covered by the original unit tests *
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no )
 - The runtime per-record code paths (performance sensitive): (no )
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33948) Cleanup usage of org.apache.flink.table.api.dataview.MapView#MapView

2023-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33948:
---
Labels: pull-request-available  (was: )

> Cleanup usage of org.apache.flink.table.api.dataview.MapView#MapView
> 
>
> Key: FLINK-33948
> URL: https://issues.apache.org/jira/browse/FLINK-33948
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 1.19.0
>Reporter: Jacky Lau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-33948][table] Cleanup usage of deprecated org.apache.flink.tab… [flink]

2023-12-26 Thread via GitHub


liuyongvs opened a new pull request, #23996:
URL: https://github.com/apache/flink/pull/23996

   …le.api.dataview.MapView#MapView
   
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-33948) Cleanup usage of org.apache.flink.table.api.dataview.MapView#MapView

2023-12-26 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33948:
-

 Summary: Cleanup usage of 
org.apache.flink.table.api.dataview.MapView#MapView
 Key: FLINK-33948
 URL: https://issues.apache.org/jira/browse/FLINK-33948
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33944) Apache Flink: Process to restore more than one job on job manager startup from the respective savepoints

2023-12-26 Thread Vijay (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated FLINK-33944:
--
Description: 
 
We are using Flink (1.18) version for our Flink cluster. The job manager has 
been deployed in "Application mode" and we are looking for a process to restore 
multiple jobs (using their respective savepoint directories) when the job 
manager is started. Currently, we have the option to restore only one job while 
running "standalone-job.sh" using the --fromSavepoint and 
--allowNonRestoredState. However, we need a way to trigger multiple job 
executions via Java client (from its respective savepoint location) on 
Jobmanager startup.

Note: We are not using a Kubernetes native deployment, but we are using k8s 
standalone mode of deployment.

Additional Query: If there is a process to restore multiple jobs from its 
respective savepoints on "Application mode" of deployment, is the same 
supported on Session mode of deployment or not?

*Expected process:*
 # Before starting the Flink/application image upgrade, trigger savepoints for 
all the currently running jobs.
 # Once the savepoint process has completed for all jobs, trigger the 
scale-down of the job manager and task manager instances.
 # Update the image version on the k8s deployment with the updated application 
image.
 # After the image version is updated, scale up the job manager and task manager.
 # We need a process to restore the previously running jobs from their 
savepoint directories and start all the jobs.

  was:
 
We are using Flink (1.18) version for our Flink cluster. The job manager has 
been deployed in "Application mode" and we are looking for a process to restore 
multiple jobs (using their respective savepoint directories) when the job 
manager is started. Currently, we have the option to restore only one job while 
running "standalone-job.sh" using the --fromSavepoint and 
--allowNonRestoredState. However, we need a way to trigger multiple job 
executions via Java client.

Note: We are not using a Kubernetes native deployment, but we are using k8s 
standalone mode of deployment.

Additional Query: If there is a process to restore multiple jobs from its 
respective savepoints on "Application mode" of deployment, is the same 
supported on Session mode of deployment or not?

*Expected process:*
 # Before starting with the Flink/application image upgrade, trigger the 
savepoints for all the current running jobs.
 # Once the savepoints process completed for all jobs, will trigger the scale 
down of job manager and task manager instances.
 # Update the image version on the k8s deployment with the update application 
image.
 # After image version is updated, scale up the job manager and task manager.
 # We need a process to restore the previously running jobs from the savepoint 
dir and start all the jobs.


> Apache Flink: Process to restore more than one job on job manager startup 
> from the respective savepoints
> 
>
> Key: FLINK-33944
> URL: https://issues.apache.org/jira/browse/FLINK-33944
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.18.0
>Reporter: Vijay
>Priority: Major
>
>  
> We are using Flink (1.18) version for our Flink cluster. The job manager has 
> been deployed in "Application mode" and we are looking for a process to 
> restore multiple jobs (using their respective savepoint directories) when the 
> job manager is started. Currently, we have the option to restore only one job 
> while running "standalone-job.sh" using the --fromSavepoint and 
> --allowNonRestoredState. However, we need a way to trigger multiple job 
> executions via Java client (from its respective savepoint location) on 
> Jobmanager startup.
> Note: We are not using a Kubernetes native deployment, but we are using k8s 
> standalone mode of deployment.
> Additional Query: If there is a process to restore multiple jobs from its 
> respective savepoints on "Application mode" of deployment, is the same 
> supported on Session mode of deployment or not?
> *Expected process:*
>  # Before starting the Flink/application image upgrade, trigger savepoints 
> for all the currently running jobs.
>  # Once the savepoint process has completed for all jobs, trigger the 
> scale-down of the job manager and task manager instances.
>  # Update the image version on the k8s deployment with the updated 
> application image.
>  # After the image version is updated, scale up the job manager and task 
> manager.
>  # We need a process to restore the previously running jobs from their 
> savepoint directories and start all the jobs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33947) Fix bugs in DelegatingConfiguration missed the prefix mapping

2023-12-26 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-33947:

Fix Version/s: 1.19.0
   1.17.3
   1.18.2

> Fix bugs in DelegatingConfiguration missed the prefix mapping 
> --
>
> Key: FLINK-33947
> URL: https://issues.apache.org/jira/browse/FLINK-33947
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Reporter: RocMarshal
>Assignee: RocMarshal
>Priority: Major
> Fix For: 1.19.0, 1.17.3, 1.18.2
>
>
> This resulted from 
> [https://github.com/apache/flink/pull/23994#issuecomment-1869905090]:
> -  Check for and confirm other potential bug points.
> -  Fix the bugs where the prefix key mapping is missed in these operations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33947) Fix bugs in DelegatingConfiguration missed the prefix mapping

2023-12-26 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan reassigned FLINK-33947:
---

Assignee: RocMarshal

> Fix bugs in DelegatingConfiguration missed the prefix mapping 
> --
>
> Key: FLINK-33947
> URL: https://issues.apache.org/jira/browse/FLINK-33947
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Reporter: RocMarshal
>Assignee: RocMarshal
>Priority: Major
>
> This resulted from 
> [https://github.com/apache/flink/pull/23994#issuecomment-1869905090]:
> -  Check for and confirm other potential bug points.
> -  Fix the bugs where the prefix key mapping is missed in these operations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33856) Add metrics to monitor the interaction performance between task and external storage system in the process of checkpoint making

2023-12-26 Thread Hangxiang Yu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800683#comment-17800683
 ] 

Hangxiang Yu commented on FLINK-33856:
--

[~Weijie Guo] Thanks for pinging me here.

[~hejufang001]

Thanks for the proposal.

I think these metrics sound reasonable.

IIUC, they are checkpoint-related, task-level metrics.

I think we could use the TraceReporter introduced by FLINK-33695 rather than 
the current MetricReporter, for the reasons mentioned in FLINK-33695.

cc [~pnowojski] 

> Add metrics to monitor the interaction performance between task and external 
> storage system in the process of checkpoint making
> ---
>
> Key: FLINK-33856
> URL: https://issues.apache.org/jira/browse/FLINK-33856
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.18.0
>Reporter: Jufang He
>Assignee: Jufang He
>Priority: Major
>  Labels: pull-request-available
>
> When Flink makes a checkpoint, the interaction performance with the external 
> file system has a great impact on the overall duration. Therefore, it is easy 
> to spot the bottleneck by adding performance metrics where the task interacts 
> with the external file storage system. These include: the file write rate, 
> the latency of writing the file, and the latency of closing the file.
> Adding these metrics on the Flink side has the following advantages: it is 
> convenient for comparing the E2E checkpoint duration of different tasks, and 
> there is no need to distinguish the type of external storage system, since 
> the metrics can be implemented uniformly in FsCheckpointStreamFactory.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33942][configuration] Fix the bug that DelegatingConfiguration misses the prefix in some get methods [flink]

2023-12-26 Thread via GitHub


RocMarshal commented on PR #23994:
URL: https://github.com/apache/flink/pull/23994#issuecomment-1869917144

   > It's indeed a bug. The removeConfig and removeKey of 
DelegatingConfiguration missed the prefix as well. I didn't notice them in the 
beginning. Would you like to fix it? If yes, feel free to take it, and I'm glad 
to help review.
   
   Thank you for your confirmation. My pleasure to do it (via 
https://issues.apache.org/jira/browse/FLINK-33947).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-33947) Fix bugs in DelegatingConfiguration missed the prefix mapping

2023-12-26 Thread RocMarshal (Jira)
RocMarshal created FLINK-33947:
--

 Summary: Fix bugs in DelegatingConfiguration missed the prefix 
mapping 
 Key: FLINK-33947
 URL: https://issues.apache.org/jira/browse/FLINK-33947
 Project: Flink
  Issue Type: Bug
  Components: API / Core
Reporter: RocMarshal


This resulted from 
[https://github.com/apache/flink/pull/23994#issuecomment-1869905090]:

-  Check for and confirm other potential bug points.

-  Fix the bugs where the prefix key mapping is missed in these operations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33946) RocksDb sets setAvoidFlushDuringShutdown to true to speed up Task Cancel

2023-12-26 Thread Yue Ma (Jira)
Yue Ma created FLINK-33946:
--

 Summary: RocksDb sets setAvoidFlushDuringShutdown to true to speed 
up Task Cancel
 Key: FLINK-33946
 URL: https://issues.apache.org/jira/browse/FLINK-33946
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Affects Versions: 1.19.0
Reporter: Yue Ma
 Fix For: 1.19.0


When a job fails, its tasks need to be canceled and re-deployed. 
RocksDBStateBackend calls RocksDB.close() when disposing.


{code:cpp}
if (!shutting_down_.load(std::memory_order_acquire) &&
    has_unpersisted_data_.load(std::memory_order_relaxed) &&
    !mutable_db_options_.avoid_flush_during_shutdown) {
  if (immutable_db_options_.atomic_flush) {
    autovector<ColumnFamilyData*> cfds;
    SelectColumnFamiliesForAtomicFlush(&cfds);
    mutex_.Unlock();
    Status s =
        AtomicFlushMemTables(cfds, FlushOptions(), FlushReason::kShutDown);
    s.PermitUncheckedError();  //**TODO: What to do on error?
    mutex_.Lock();
  } else {
    for (auto cfd : *versions_->GetColumnFamilySet()) {
      if (!cfd->IsDropped() && cfd->initialized() && !cfd->mem()->IsEmpty()) {
        cfd->Ref();
        mutex_.Unlock();
        Status s = FlushMemTable(cfd, FlushOptions(), FlushReason::kShutDown);
        s.PermitUncheckedError();  //**TODO: What to do on error?
        mutex_.Lock();
        cfd->UnrefAndTryDelete();
      }
    }
  }
}{code}


By default (avoid_flush_during_shutdown=false), RocksDB flushes the memtables 
on Close. When disk pressure is high or the memtable is large, this process can 
be time-consuming, which causes the task to get stuck in the CANCELING stage 
and slows down job failover.
In fact, it is completely unnecessary to flush the memtable when a Flink task 
closes, because the data can be replayed from the checkpoint. So we can set 
avoid_flush_during_shutdown to true to speed up task failover.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33945) Cleanup usage of deprecated org.apache.flink.table.api.dataview.ListView#ListView

2023-12-26 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33945:
-

 Summary: Cleanup usage of deprecated 
org.apache.flink.table.api.dataview.ListView#ListView
 Key: FLINK-33945
 URL: https://issues.apache.org/jira/browse/FLINK-33945
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33942][configuration] Fix the bug that DelegatingConfiguration misses the prefix in some get methods [flink]

2023-12-26 Thread via GitHub


1996fanrui commented on PR #23994:
URL: https://github.com/apache/flink/pull/23994#issuecomment-1869905090

   Thanks @RocMarshal for the quick review!

   > if we run the followed case, will be our expected results consistent with 
the current fixed results?
   > 
   > ```
   > @Test
   > void testUnknownCases() {
   > Configuration original = new Configuration();
   > final DelegatingConfiguration delegatingConf =
   > new DelegatingConfiguration(original, "prefix.");
   > 
   > // Test for integer
   > ConfigOption<Integer> integerOption =
   > ConfigOptions.key("integer.key").intType().noDefaultValue();
   > 
   > // integerOption doesn't exist in delegatingConf, and it should be 
overrideDefault.
   > original.setInteger(integerOption, 1);
   > assertThat(delegatingConf.getInteger(integerOption, 
2)).isEqualTo(2);
   > 
   > // integerOption exists in delegatingConf, and it should be value 
that set before.
   > delegatingConf.setInteger(integerOption, 3);
   > assertThat(delegatingConf.getInteger(integerOption, 
2)).isEqualTo(3);
   > 
   > delegatingConf.removeConfig(integerOption);
   > System.out.println(delegatingConf.get(integerOption));
   > 
   > }
   > ```
   > 
   > If not, it may be a bug ?
   
   It's indeed a bug. The `removeConfig` and `removeKey` of 
`DelegatingConfiguration` missed the `prefix` as well. I didn't notice them in 
the beginning. Would you like to fix it? If yes, feel free to take it, and I'm 
glad to help review.
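
   The failure mode can be shown with a minimal sketch (simplified stand-ins, not Flink's actual Configuration/DelegatingConfiguration classes): if a remove method forgets the prefix that get/set apply, removals through the delegate silently miss the entry.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a key/value configuration.
class SimpleConfig {
    final Map<String, String> entries = new HashMap<>();

    void set(String key, String value) { entries.put(key, value); }
    String get(String key) { return entries.get(key); }
    boolean removeKey(String key) { return entries.remove(key) != null; }
}

// Simplified stand-in for a prefix-delegating configuration.
class PrefixDelegatingConfig {
    private final SimpleConfig backing;
    private final String prefix;

    PrefixDelegatingConfig(SimpleConfig backing, String prefix) {
        this.backing = backing;
        this.prefix = prefix;
    }

    void set(String key, String value) { backing.set(prefix + key, value); }
    String get(String key) { return backing.get(prefix + key); }

    // Buggy variant: forgets the prefix, mirroring the reported bug.
    boolean removeKeyBuggy(String key) { return backing.removeKey(key); }

    // Fixed variant: applies the prefix exactly like get/set do.
    boolean removeKeyFixed(String key) { return backing.removeKey(prefix + key); }
}

public class DelegatingConfigDemo {
    public static void main(String[] args) {
        SimpleConfig original = new SimpleConfig();
        PrefixDelegatingConfig delegating =
                new PrefixDelegatingConfig(original, "prefix.");

        delegating.set("integer.key", "3"); // stored as "prefix.integer.key"

        // The buggy remove looks up "integer.key" and removes nothing.
        System.out.println(delegating.removeKeyBuggy("integer.key")); // false
        // The fixed remove targets "prefix.integer.key" and succeeds.
        System.out.println(delegating.removeKeyFixed("integer.key")); // true
    }
}
```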
   
   Also, I have addressed your other comments. Please help double-check in your 
free time, thanks~
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800679#comment-17800679
 ] 

Lijie Wang commented on FLINK-33943:


Maybe you can see the comments in FLINK-19358 and 
[FLIP-85|https://cwiki.apache.org/confluence/display/FLINK/FLIP-85%3A+Flink+Application+Mode]

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink 1.18 and Zookeeper 3.7.2 for our Flink cluster. 
> The job manager has been deployed in "Application mode", and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only one 
> job getting executed on the Flink dashboard. Following are the parameters set 
> for the Zookeeper-based HA setup in flink-conf.yaml. Please let us know if 
> anyone has experienced similar issues and has any suggestions. Thanks in 
> advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha

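For context, the multi-job submission pattern the reporter describes can be sketched as below. This is a minimal, hypothetical sketch (class name, job names, and pipelines are invented; it requires a running Flink cluster and is not runnable standalone). With ZooKeeper HA enabled in application mode, only a single job is recovered and shown on the dashboard, which matches the reported behavior.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MultiJobApp {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // First pipeline.
        env.fromElements(1, 2, 3).map(i -> i * 2).print();
        // executeAsync() returns immediately with a JobClient instead of
        // blocking, so several jobs can be launched from one main() method.
        env.executeAsync("job-1");

        // Second pipeline on the same environment (the transformation list
        // is cleared when a job graph is generated).
        env.fromElements("a", "b").map(String::toUpperCase).print();
        env.executeAsync("job-2");
    }
}
```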


--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-32881) Client supports making savepoints in detach mode

2023-12-26 Thread Hangxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hangxiang Yu resolved FLINK-32881.
--
Resolution: Fixed

merged into master via 81598f8a...d0dbd51c

> Client supports making savepoints in detach mode
> 
>
> Key: FLINK-32881
> URL: https://issues.apache.org/jira/browse/FLINK-32881
> Project: Flink
>  Issue Type: Improvement
>  Components: Client / Job Submission, Runtime / Checkpointing
>Affects Versions: 1.19.0
>Reporter: Renxiang Zhou
>Assignee: Renxiang Zhou
>Priority: Major
>  Labels: detach-savepoint, pull-request-available
> Fix For: 1.19.0
>
> Attachments: image-2023-08-16-17-14-34-740.png, 
> image-2023-08-16-17-14-44-212.png
>
>
> When triggering a savepoint using the command-line tool, the client needs to 
> wait for the job to finish creating the savepoint before it can exit. For 
> jobs with large state, the savepoint creation process can be time-consuming, 
> leading to the following problems:
>  # Platform users may need to manage thousands of Flink tasks on a single 
> client machine. With the current savepoint triggering mode, all savepoint 
> creation threads on that machine have to wait for the job to finish the 
> snapshot, resulting in significant resource waste;
>  # If the savepoint production time exceeds the client's timeout duration, the 
> client will throw a timeout exception and report that the triggering 
> savepoint process fails. Since different jobs have varying savepoint 
> durations, it is difficult to adjust the timeout parameter on the client side.
> Therefore, we propose adding a detach mode for triggering savepoints on the 
> client side, similar to the detach mode behavior when submitting jobs. 
> Here are some specific details:
>  # The savepoint UUID will be generated on the client side. After 
> successfully triggering the savepoint, the client immediately returns the 
> UUID information and exits.
>  # Add a "dump-pending-savepoints" API that allows the client to check 
> whether the triggered savepoint has been successfully created.
> By implementing these changes, the client can detach from the savepoint 
> creation process, reducing resource waste, and providing a way to check the 
> status of savepoint creation.
> !image-2023-08-16-17-14-34-740.png|width=2129,height=625!!image-2023-08-16-17-14-44-212.png|width=1530,height=445!
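The intended client behavior can be illustrated with a hedged CLI sketch. This is an assumption-laden usage example, not authoritative documentation: the exact option spelling (shown here as `-detached`) and output should be checked against the Flink 1.19 CLI docs, and the command requires a running cluster.

```shell
# Hypothetical sketch of detached savepoint triggering (Flink 1.19+).
# <jobId> and the target directory are placeholders.
flink savepoint -detached <jobId> s3://bucket/savepoints

# The client prints the savepoint trigger ID and exits immediately;
# completion can then be checked later, e.g. via the REST API for
# pending savepoints, instead of blocking until the snapshot finishes.
```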





[jira] [Comment Edited] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800677#comment-17800677
 ] 

Vijay edited comment on FLINK-33943 at 12/27/23 2:48 AM:
-

[~wanglijie] Thanks for the prompt update. Is there a plan to support HA 
functionality in application mode (for multiple executions) in future 
versions? (Or) is there a technical reason why it's not supported currently?


was (Author: JIRAUSER303619):
[~wanglijie]  Thanks for the prompt update. Is there a plan to support of HA 
functionality on application mode (for multiple exections) be supported in near 
future versions? (or) is there is technical reason why its not supported 
currently?

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only 
> one job getting executed on the Flink dashboard. Following are the 
> parameters configured for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha





[jira] [Comment Edited] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800677#comment-17800677
 ] 

Vijay edited comment on FLINK-33943 at 12/27/23 2:48 AM:
-

[~wanglijie] Thanks for the prompt update. Is there a plan to support HA 
functionality in application mode (for multiple executions) in future 
versions? (Or) is there a technical reason why it's not supported currently?


was (Author: JIRAUSER303619):
[~wanglijie]  Thanks for the prompt update. Is there a plan to support of HA 
functionality on application mode (for multiple exections) in near future 
versions? (or) is there is technical reason why its not supported currently?

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only 
> one job getting executed on the Flink dashboard. Following are the 
> parameters configured for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha





[jira] [Comment Edited] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800677#comment-17800677
 ] 

Vijay edited comment on FLINK-33943 at 12/27/23 2:48 AM:
-

[~wanglijie] Thanks for the prompt update. Is there a plan to support HA 
functionality in application mode (for multiple executions) in future 
versions? (Or) is there a technical reason why it's not supported currently?


was (Author: JIRAUSER303619):
Thanks for the prompt update. Is there a plan to support of HA functionality on 
application mode (for multiple exections) be supported in near future versions? 
(or) is there is technical reason why its not supported currently?

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only 
> one job getting executed on the Flink dashboard. Following are the 
> parameters configured for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha





Re: [PR] [FLINK-32881][checkpoint] Support triggering savepoint in detach mode for CLI and dumping all pending savepoint ids by rest api [flink]

2023-12-26 Thread via GitHub


masteryhx closed pull request #23253: [FLINK-32881][checkpoint] Support 
triggering savepoint in detach mode for CLI and dumping all pending savepoint 
ids by rest api 
URL: https://github.com/apache/flink/pull/23253





[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800677#comment-17800677
 ] 

Vijay commented on FLINK-33943:
---

Thanks for the prompt update. Is there a plan to support HA functionality in 
application mode (for multiple executions) in future versions? (Or) is there a 
technical reason why it's not supported currently?

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only 
> one job getting executed on the Flink dashboard. Following are the 
> parameters configured for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha





[jira] [Closed] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Lijie Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang closed FLINK-33943.
--
Resolution: Not A Bug

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only 
> one job getting executed on the Flink dashboard. Following are the 
> parameters configured for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha





[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800676#comment-17800676
 ] 

Lijie Wang commented on FLINK-33943:


I'll close this issue because it's not a bug. cc [~vrang...@in.ibm.com] 

 

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only 
> one job getting executed on the Flink dashboard. Following are the 
> parameters configured for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha





[jira] [Updated] (FLINK-33944) Apache Flink: Process to restore more than one job on job manager startup from the respective savepoints

2023-12-26 Thread Vijay (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated FLINK-33944:
--
Description: 
 
We are using Flink (1.18) version for our Flink cluster. The job manager has 
been deployed in "Application mode" and we are looking for a process to restore 
multiple jobs (using their respective savepoint directories) when the job 
manager is started. Currently, we have the option to restore only one job while 
running "standalone-job.sh" with the --fromSavepoint and 
--allowNonRestoredState options. However, we need a way to trigger multiple job 
executions via Java client.

Note: We are not using a Kubernetes native deployment, but we are using k8s 
standalone mode of deployment.

Additional Query: If there is a process to restore multiple jobs from their 
respective savepoints in "Application mode" deployment, is the same supported 
in Session mode of deployment or not?

*Expected process:*
 # Before starting with the Flink/application image upgrade, trigger the 
savepoints for all the current running jobs.
 # Once the savepoint process is completed for all jobs, trigger the scale-down 
of the job manager and task manager instances.
 # Update the image version in the k8s deployment with the updated application 
image.
 # After the image version is updated, scale up the job manager and task manager.
 # We need a process to restore the previously running jobs from their savepoint 
directories and start all the jobs.

  was:
 
We are using Flink (1.18) version for our Flink cluster. The job manager has 
been deployed in "Application mode" and we are looking for a process to restore 
multiple jobs (using their respective savepoint directories) when the job 
manager is started. Currently, we have the option to restore only one job while 
running "standalone-job.sh" with the --fromSavepoint and 
--allowNonRestoredState options. However, we need a way to trigger multiple job 
executions via Java client.

Note: We are not using a Kubernetes native deployment, but we are using k8s 
standalone mode of deployment.

*Expected process:*
 # Before starting with the Flink/application image upgrade, trigger the 
savepoints for all the current running jobs.
 # Once the savepoint process is completed for all jobs, trigger the scale-down 
of the job manager and task manager instances.
 # Update the image version in the k8s deployment with the updated application 
image.
 # After the image version is updated, scale up the job manager and task manager.
 # We need a process to restore the previously running jobs from their savepoint 
directories and start all the jobs.


> Apache Flink: Process to restore more than one job on job manager startup 
> from the respective savepoints
> 
>
> Key: FLINK-33944
> URL: https://issues.apache.org/jira/browse/FLINK-33944
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.18.0
>Reporter: Vijay
>Priority: Major
>
>  
> We are using Flink (1.18) version for our Flink cluster. The job manager has 
> been deployed in "Application mode" and we are looking for a process to 
> restore multiple jobs (using their respective savepoint directories) when the 
> job manager is started. Currently, we have the option to restore only one job 
> while running "standalone-job.sh" with the --fromSavepoint and 
> --allowNonRestoredState options. However, we need a way to trigger multiple job 
> executions via Java client.
> Note: We are not using a Kubernetes native deployment, but we are using k8s 
> standalone mode of deployment.
> Additional Query: If there is a process to restore multiple jobs from their 
> respective savepoints in "Application mode" deployment, is the same supported 
> in Session mode of deployment or not?
> *Expected process:*
>  # Before starting with the Flink/application image upgrade, trigger the 
> savepoints for all the current running jobs.
>  # Once the savepoint process is completed for all jobs, trigger the scale-down 
> of the job manager and task manager instances.
>  # Update the image version in the k8s deployment with the updated application 
> image.
>  # After the image version is updated, scale up the job manager and task manager.
>  # We need a process to restore the previously running jobs from their 
> savepoint directories and start all the jobs.





[jira] [Created] (FLINK-33944) Apache Flink: Process to restore more than one job on job manager startup from the respective savepoints

2023-12-26 Thread Vijay (Jira)
Vijay created FLINK-33944:
-

 Summary: Apache Flink: Process to restore more than one job on job 
manager startup from the respective savepoints
 Key: FLINK-33944
 URL: https://issues.apache.org/jira/browse/FLINK-33944
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.18.0
Reporter: Vijay


 
We are using Flink (1.18) version for our Flink cluster. The job manager has 
been deployed in "Application mode" and we are looking for a process to restore 
multiple jobs (using their respective savepoint directories) when the job 
manager is started. Currently, we have the option to restore only one job while 
running "standalone-job.sh" with the --fromSavepoint and 
--allowNonRestoredState options. However, we need a way to trigger multiple job 
executions via Java client.

Note: We are not using a Kubernetes native deployment, but we are using k8s 
standalone mode of deployment.

*Expected process:*
 # Before starting with the Flink/application image upgrade, trigger the 
savepoints for all the current running jobs.
 # Once the savepoint process is completed for all jobs, trigger the scale-down 
of the job manager and task manager instances.
 # Update the image version in the k8s deployment with the updated application 
image.
 # After the image version is updated, scale up the job manager and task manager.
 # We need a process to restore the previously running jobs from their savepoint 
directories and start all the jobs.
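The single-job restore that is available today can be sketched as follows. This is a hedged usage example: the bucket path and job class are placeholders, and the exact script flags should be checked against the Flink standalone deployment docs.

```shell
# Hypothetical sketch: restoring one job at JobManager startup in k8s
# standalone application mode. The request in this issue is for an
# equivalent mechanism covering several jobs and savepoints at once.
./bin/standalone-job.sh start \
    --job-classname com.example.MyStreamingJob \
    --fromSavepoint s3://bucket/savepoints/savepoint-abc123 \
    --allowNonRestoredState
```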





[jira] [Updated] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated FLINK-33943:
--
Description: 
Hi Team,

*Note:* Not sure whether I have picked the right component while raising the 
issue.

Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
cluster. Job manager has been deployed on "Application mode" and when HA is 
disabled (high-availability.type: NONE) we are able to start multiple jobs 
(using env.executeAsync()) for a single application. But when I set up 
Zookeeper as the HA type (high-availability.type: zookeeper), we see only 
one job getting executed on the Flink dashboard. Following are the 
parameters configured for the Zookeeper-based HA setup in flink-conf.yaml. 
Please let us know if anyone has experienced similar issues and have any 
suggestions. Thanks in advance for your assistance.

*Note:* We are using a Streaming application and following are the 
flink-config.yaml configurations.

*Additional query:* Does "Session mode" of deployment support HA for multiple 
execute() executions?
 # high-availability.storageDir: /opt/flink/data
 # high-availability.cluster-id: test
 # high-availability.zookeeper.quorum: localhost:2181
 # high-availability.type: zookeeper
 # high-availability.zookeeper.path.root: /dp/configs/flinkha

  was:
Hi Team,

Note: Not sure whether I have picked the right component while raising the 
issue.

Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
cluster. Job manager has been deployed on "Application mode" and when HA is 
disabled (high-availability.type: NONE) we are able to start multiple jobs 
(using env.executeAsync()) for a single application. But when I set up 
Zookeeper as the HA type (high-availability.type: zookeeper), we see only 
one job getting executed on the Flink dashboard. Following are the 
parameters configured for the Zookeeper-based HA setup in flink-conf.yaml. 
Please let us know if anyone has experienced similar issues and have any 
suggestions. Thanks in advance for your assistance.

Note: We are using a Streaming application and following are the 
flink-config.yaml configurations.
 # high-availability.storageDir: /opt/flink/data
 # high-availability.cluster-id: test
 # high-availability.zookeeper.quorum: localhost:2181
 # high-availability.type: zookeeper
 # high-availability.zookeeper.path.root: /dp/configs/flinkha


> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> *Note:* Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsync()) for a single application. But when I set up 
> Zookeeper as the HA type (high-availability.type: zookeeper), we see only 
> one job getting executed on the Flink dashboard. Following are the 
> parameters configured for the Zookeeper-based HA setup in flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> *Note:* We are using a Streaming application and following are the 
> flink-config.yaml configurations.
> *Additional query:* Does "Session mode" of deployment support HA for multiple 
> execute() executions?
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha





Re: [PR] [FLINK-33942][configuration] Fix the bug that DelegatingConfiguration misses the prefix in some get methods [flink]

2023-12-26 Thread via GitHub


RocMarshal commented on code in PR #23994:
URL: https://github.com/apache/flink/pull/23994#discussion_r1436698086


##
flink-core/src/test/java/org/apache/flink/configuration/DelegatingConfigurationTest.java:
##
@@ -136,15 +136,65 @@ public void 
testDelegationConfigurationToMapConsistentWithAddAllToProperties() {
 mapProperties.put(entry.getKey(), entry.getValue());
 }
 // Verification
-assertEquals(properties, mapProperties);
+assertThat(mapProperties).isEqualTo(properties);
 }
 
 @Test
-public void testSetReturnsDelegatingConfiguration() {
+void testSetReturnsDelegatingConfiguration() {
 final Configuration conf = new Configuration();
 final DelegatingConfiguration delegatingConf = new 
DelegatingConfiguration(conf, "prefix.");
 
 
Assertions.assertThat(delegatingConf.set(CoreOptions.DEFAULT_PARALLELISM, 1))
 .isSameAs(delegatingConf);
 }
+
+@Test
+void testGetWithOverrideDefault() {

Review Comment:
   I like the test case!  



##
flink-core/src/test/java/org/apache/flink/configuration/DelegatingConfigurationTest.java:
##
@@ -115,11 +115,11 @@ public void testDelegationConfigurationWithPrefix() {
 configuration = new DelegatingConfiguration(backingConf, prefix);
 keySet = configuration.keySet();
 
-assertTrue(keySet.isEmpty());
+assertThat(keySet).isEmpty();

Review Comment:
   how about   
   ```
   assertThat(configuration.keySet()).isEmpty();
   
   ```



##
flink-core/src/test/java/org/apache/flink/configuration/DelegatingConfigurationTest.java:
##
@@ -103,8 +103,8 @@ public void testDelegationConfigurationWithPrefix() {
 DelegatingConfiguration configuration = new 
DelegatingConfiguration(backingConf, prefix);
 Set keySet = configuration.keySet();
 
-assertEquals(keySet.size(), 1);
-assertEquals(keySet.iterator().next(), expectedKey);
+assertThat(keySet).hasSize(1);
+assertThat(expectedKey).isEqualTo(keySet.iterator().next());

Review Comment:
   how about :
   
   ```
   assertThat(configuration.keySet()).containsExactly(expectedKey);
   ```






[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800671#comment-17800671
 ] 

Lijie Wang commented on FLINK-33943:


[~vrang...@in.ibm.com] I think it should work if you use session mode instead 
of application mode.

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> Note: Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsyn()) for a single application. But when I setup the 
> Zookeeper as the HA type (high-availability.type: zookeeper), we are only 
> seeing only one job is getting executed on the Flink dashboard. Following are 
> the parameters setup for the Zookeeper based HA setup on the flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> Note: We are using a Streaming application and following are the 
> flink-config.yaml configurations.
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800670#comment-17800670
 ] 

Vijay edited comment on FLINK-33943 at 12/27/23 2:22 AM:
-

[~wanglijie] Does HA in session mode support execution of multiple 
execute/executeAsync operations? Sorry, I am unable to find any documentation 
related to HA in session mode and its features/limitations.


was (Author: JIRAUSER303619):
[~wanglijie] Does HA in session mode support execution of multiple 
execute/executeAsync operations?

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> Note: Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsyn()) for a single application. But when I setup the 
> Zookeeper as the HA type (high-availability.type: zookeeper), we are only 
> seeing only one job is getting executed on the Flink dashboard. Following are 
> the parameters setup for the Zookeeper based HA setup on the flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> Note: We are using a Streaming application and following are the 
> flink-config.yaml configurations.
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800670#comment-17800670
 ] 

Vijay commented on FLINK-33943:
---

[~wanglijie] Does HA in session mode support execution of multiple 
execute/executeAsync operations?

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> Note: Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsyn()) for a single application. But when I setup the 
> Zookeeper as the HA type (high-availability.type: zookeeper), we are only 
> seeing only one job is getting executed on the Flink dashboard. Following are 
> the parameters setup for the Zookeeper based HA setup on the flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> Note: We are using a Streaming application and following are the 
> flink-config.yaml configurations.
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800669#comment-17800669
 ] 

Vijay commented on FLINK-33943:
---

 
The issue can be reproduced by enabling high-availability.type: zookeeper (with 
the configuration specified above in the issue) and, in the Flink client code, calling 
env.executeAsync() for multiple job instances of the same application. Now 
open the Dashboard and check the number of running jobs (the same can be checked via a REST 
API call); you will find only one job running. When you disable HA 
(high-availability.type: NONE), you can see multiple jobs running (also 
visible via the REST API).

REST API: http://:8081/v1/jobs
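
The REST check above can be sketched with a small, self-contained helper that counts RUNNING jobs in a /v1/jobs-style payload. The `JobsCheck` class and the sample JSON below are illustrative assumptions, not taken from the reporter's cluster; a naive string scan is used only to stay dependency-free (use a real JSON parser in practice).

```java
// Hypothetical helper for inspecting a JobManager /v1/jobs response.
public class JobsCheck {

    // Count jobs whose status is RUNNING in a /v1/jobs-style JSON payload.
    // Naive string scan to avoid external JSON libraries; illustrative only.
    public static int countRunning(String jobsJson) {
        int count = 0;
        int idx = 0;
        while ((idx = jobsJson.indexOf("\"RUNNING\"", idx)) >= 0) {
            count++;
            idx += "\"RUNNING\"".length();
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample payload shaped like the REST API response (assumed, not real data).
        String sample =
                "{\"jobs\":[{\"id\":\"a1\",\"status\":\"RUNNING\"},"
              + "{\"id\":\"b2\",\"status\":\"RUNNING\"}]}";
        System.out.println(countRunning(sample)); // prints 2
    }
}
```

With HA enabled in Application Mode, a payload like the sample would show a single job instead of the expected count.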

> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> Note: Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsyn()) for a single application. But when I setup the 
> Zookeeper as the HA type (high-availability.type: zookeeper), we are only 
> seeing only one job is getting executed on the Flink dashboard. Following are 
> the parameters setup for the Zookeeper based HA setup on the flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> Note: We are using a Streaming application and following are the 
> flink-config.yaml configurations.
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800668#comment-17800668
 ] 

Lijie Wang commented on FLINK-33943:


Hi [~vrang...@in.ibm.com], currently, HA in Application Mode is only supported 
for single-execute() applications. You can find more details in [flink 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/overview/#application-mode].


> Apache flink: Issues after configuring HA (using zookeeper setting)
> ---
>
> Key: FLINK-33943
> URL: https://issues.apache.org/jira/browse/FLINK-33943
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.18.0
> Environment: Flink version: 1.18
> Zookeeper version: 3.7.2
> Env: Custom flink docker image (with embedded application class) deployed 
> over kubernetes (v1.26.11).
>  
>Reporter: Vijay
>Priority: Major
>
> Hi Team,
> Note: Not sure whether I have picked the right component while raising the 
> issue.
> Good Day. I am using Flink (1.18) version and zookeeper (3.7.2) for our flink 
> cluster. Job manager has been deployed on "Application mode" and when HA is 
> disabled (high-availability.type: NONE) we are able to start multiple jobs 
> (using env.executeAsyn()) for a single application. But when I setup the 
> Zookeeper as the HA type (high-availability.type: zookeeper), we are only 
> seeing only one job is getting executed on the Flink dashboard. Following are 
> the parameters setup for the Zookeeper based HA setup on the flink-conf.yaml. 
> Please let us know if anyone has experienced similar issues and have any 
> suggestions. Thanks in advance for your assistance.
> Note: We are using a Streaming application and following are the 
> flink-config.yaml configurations.
>  # high-availability.storageDir: /opt/flink/data
>  # high-availability.cluster-id: test
>  # high-availability.zookeeper.quorum: localhost:2181
>  # high-availability.type: zookeeper
>  # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33943) Apache flink: Issues after configuring HA (using zookeeper setting)

2023-12-26 Thread Vijay (Jira)
Vijay created FLINK-33943:
-

 Summary: Apache flink: Issues after configuring HA (using 
zookeeper setting)
 Key: FLINK-33943
 URL: https://issues.apache.org/jira/browse/FLINK-33943
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.18.0
 Environment: Flink version: 1.18

Zookeeper version: 3.7.2

Env: Custom flink docker image (with embedded application class) deployed over 
kubernetes (v1.26.11).

 
Reporter: Vijay


Hi Team,

Note: Not sure whether I have picked the right component while raising the 
issue.

Good day. We are using Flink 1.18 and ZooKeeper 3.7.2 for our Flink 
cluster. The JobManager has been deployed in "Application Mode", and when HA is 
disabled (high-availability.type: NONE) we are able to start multiple jobs 
(using env.executeAsync()) from a single application. But when we set 
ZooKeeper as the HA type (high-availability.type: zookeeper), only one 
job shows as executing on the Flink dashboard. The following are 
the parameters set up for the ZooKeeper-based HA in flink-conf.yaml. 
Please let us know if anyone has experienced similar issues and has any 
suggestions. Thanks in advance for your assistance.

Note: We are using a Streaming application and following are the 
flink-config.yaml configurations.
 # high-availability.storageDir: /opt/flink/data
 # high-availability.cluster-id: test
 # high-availability.zookeeper.quorum: localhost:2181
 # high-availability.type: zookeeper
 # high-availability.zookeeper.path.root: /dp/configs/flinkha



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] Static import removed [flink]

2023-12-26 Thread via GitHub


flinkbot commented on PR #23995:
URL: https://github.com/apache/flink/pull/23995#issuecomment-1869708699

   
   ## CI report:
   
   * c6fd1da2aee1519c4dd2d41c2c8212185f4bbab2 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Static import removed [flink]

2023-12-26 Thread via GitHub


Shrey-Viradiya opened a new pull request, #23995:
URL: https://github.com/apache/flink/pull/23995

   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (FLINK-32849) [JUnit5 Migration] The resourcemanager package of flink-runtime module

2023-12-26 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan resolved FLINK-32849.
-
Fix Version/s: 1.19.0
   Resolution: Fixed

> [JUnit5 Migration] The resourcemanager package of flink-runtime module
> --
>
> Key: FLINK-32849
> URL: https://issues.apache.org/jira/browse/FLINK-32849
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Rui Fan
>Assignee: RocMarshal
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32849) [JUnit5 Migration] The resourcemanager package of flink-runtime module

2023-12-26 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800504#comment-17800504
 ] 

Rui Fan commented on FLINK-32849:
-

Merged to master (1.19) via 11cdf7e7adacfe64d961a48844841a24b918257a

> [JUnit5 Migration] The resourcemanager package of flink-runtime module
> --
>
> Key: FLINK-32849
> URL: https://issues.apache.org/jira/browse/FLINK-32849
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Rui Fan
>Assignee: RocMarshal
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-32849][runtime][JUnit5 Migration] The resourcemanager package of flink-runtime module [flink]

2023-12-26 Thread via GitHub


1996fanrui merged PR #23973:
URL: https://github.com/apache/flink/pull/23973


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33386][runtime] Support tasks balancing at slot level for Default Scheduler [flink]

2023-12-26 Thread via GitHub


RocXing commented on code in PR #23635:
URL: https://github.com/apache/flink/pull/23635#discussion_r1436413745


##
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/TaskBalancedPreferredSlotSharingStrategy.java:
##
@@ -0,0 +1,344 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.jobmanager.scheduler.CoLocationConstraint;
+import org.apache.flink.runtime.jobmanager.scheduler.CoLocationGroup;
+import org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroup;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+import org.apache.flink.runtime.scheduler.strategy.SchedulingExecutionVertex;
+import org.apache.flink.runtime.scheduler.strategy.SchedulingTopology;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+/**
+ * This strategy tries to achieve balanced task scheduling. Execution vertices that belong to
+ * the same SlotSharingGroup tend to be distributed evenly across ExecutionSlotSharingGroups.
+ * Co-location constraints will be respected.
+ */
+class TaskBalancedPreferredSlotSharingStrategy extends 
AbstractSlotSharingStrategy {
+
+public static final Logger LOG =
+
LoggerFactory.getLogger(TaskBalancedPreferredSlotSharingStrategy.class);
+
+TaskBalancedPreferredSlotSharingStrategy(
+final SchedulingTopology topology,
+final Set slotSharingGroups,
+final Set coLocationGroups) {
+super(topology, slotSharingGroups, coLocationGroups);
+}
+
+@Override
+protected Map 
computeExecutionSlotSharingGroups(
+SchedulingTopology schedulingTopology) {
+return new TaskBalancedExecutionSlotSharingGroupBuilder(
+schedulingTopology, this.logicalSlotSharingGroups, 
this.coLocationGroups)
+.build();
+}
+
+static class Factory implements SlotSharingStrategy.Factory {
+
+public TaskBalancedPreferredSlotSharingStrategy create(
+final SchedulingTopology topology,
+final Set slotSharingGroups,
+final Set coLocationGroups) {
+
+return new TaskBalancedPreferredSlotSharingStrategy(
+topology, slotSharingGroups, coLocationGroups);
+}
+}
+
+/** The interface to compute the fittest slot index. */
+private interface SlotIndexSupplier {
+
+int getFittestSlotIndex(
+final SlotSharingGroup slotSharingGroup,
+@Nullable final SchedulingExecutionVertex executionVertex);
+}
+
+/** SlotSharingGroupBuilder class for balanced scheduling strategy. */
+private static class TaskBalancedExecutionSlotSharingGroupBuilder {
+
+private final SchedulingTopology topology;
+
+private final Map slotSharingGroupMap;
+
+/** Record the {@link ExecutionSlotSharingGroup}s for {@link 
SlotSharingGroup}s. */
+private final Map>
+paralleledExecutionSlotSharingGroupsMap;
+
+/**
+ * Record the next round-robin {@link ExecutionSlotSharingGroup} index 
for {@link
+ * SlotSharingGroup}s.
+ */
+private final Map slotSharingGroupIndexMap;
+
+private final Map
+executionSlotSharingGroupMap;
+
+private final Map coLocationGroupMap;
+
+private final Map
+constraintToExecutionSlotSharingGroupMap;
+
+private TaskBalancedExecutionSlotSharingGroupBuilder(
+final SchedulingTopology topology,
+final Set slotSharingGroups,
+final Set coLocationGroups) {
+this.topology = checkNotNull(topology);
+
+
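
The round-robin idea behind the builder in the diff above can be sketched independently of Flink's types. The `RoundRobinSlots` class below is a hypothetical stand-in, not the PR's actual strategy; the real implementation additionally handles co-location constraints and multiple slot sharing groups.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch of round-robin task-to-slot assignment.
public class RoundRobinSlots {

    // Distribute task names over slotCount slots, one at a time, so that
    // slot sizes differ by at most one (the "balanced" property).
    public static List<List<String>> assign(List<String> tasks, int slotCount) {
        List<List<String>> slots = new ArrayList<>();
        for (int i = 0; i < slotCount; i++) {
            slots.add(new ArrayList<>());
        }
        int next = 0; // next round-robin slot index, analogous to the strategy's index map
        for (String task : tasks) {
            slots.get(next).add(task);
            next = (next + 1) % slotCount;
        }
        return slots;
    }

    public static void main(String[] args) {
        List<String> tasks = List.of("map-0", "map-1", "map-2", "sink-0", "sink-1");
        System.out.println(assign(tasks, 2)); // [[map-0, map-2, sink-1], [map-1, sink-0]]
    }
}
```

The per-group index map in the real builder plays the role of `next` here, tracked separately for each SlotSharingGroup.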

[jira] [Assigned] (FLINK-20281) Window aggregation supports changelog stream input

2023-12-26 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-20281:
---

Assignee: xuyang

> Window aggregation supports changelog stream input
> --
>
> Key: FLINK-20281
> URL: https://issues.apache.org/jira/browse/FLINK-20281
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Planner, Table SQL / Runtime
>Reporter: Jark Wu
>Assignee: xuyang
>Priority: Major
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
> Attachments: screenshot-1.png
>
>
> Currently, window aggregation doesn't support to consume a changelog stream. 
> This makes it impossible to do a window aggregation on changelog sources 
> (e.g. Kafka with Debezium format, or upsert-kafka, or mysql-cdc). 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33942][configuration] Fix the bug that DelegatingConfiguration misses the prefix in some get methods [flink]

2023-12-26 Thread via GitHub


flinkbot commented on PR #23994:
URL: https://github.com/apache/flink/pull/23994#issuecomment-1869458088

   
   ## CI report:
   
   * 19de85716cc7962c47836d5d11859c333d4d454b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33942) DelegatingConfiguration misses the prefix for some methods

2023-12-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-33942:
---
Labels: pull-request-available  (was: )

> DelegatingConfiguration misses the perfix for some methods
> --
>
> Key: FLINK-33942
> URL: https://issues.apache.org/jira/browse/FLINK-33942
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0, 1.17.3, 1.18.2
>
>
> DelegatingConfiguration misses the perfix for some methods, such as:
>  * 
> DelegatingConfiguration#getInteger(org.apache.flink.configuration.ConfigOption,
>  int)
>  * 
> DelegatingConfiguration#getLong(org.apache.flink.configuration.ConfigOption,
>  long)
>  * 
> org.apache.flink.configuration.DelegatingConfiguration#getBoolean(org.apache.flink.configuration.ConfigOption,
>  boolean)
>  * 
> org.apache.flink.configuration.DelegatingConfiguration#getFloat(org.apache.flink.configuration.ConfigOption,
>  float)
>  * 
> org.apache.flink.configuration.DelegatingConfiguration#getDouble(org.apache.flink.configuration.ConfigOption,
>  double)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-33942][configuration] Fix the bug that DelegatingConfiguration misses the prefix in some get methods [flink]

2023-12-26 Thread via GitHub


1996fanrui opened a new pull request, #23994:
URL: https://github.com/apache/flink/pull/23994

   ## What is the purpose of the change
   
DelegatingConfiguration misses the prefix for some methods, such as:
   
   - 
DelegatingConfiguration#getInteger(org.apache.flink.configuration.ConfigOption,
 int)
   - 
DelegatingConfiguration#getLong(org.apache.flink.configuration.ConfigOption,
 long)
   - 
DelegatingConfiguration#getBoolean(org.apache.flink.configuration.ConfigOption,
 boolean)
   - 
DelegatingConfiguration#getFloat(org.apache.flink.configuration.ConfigOption,
 float)
   - 
DelegatingConfiguration#getDouble(org.apache.flink.configuration.ConfigOption,
 double)
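
The effect of the missing prefix can be illustrated with a minimal, self-contained sketch. `DelegatingConfigDemo` below uses plain-Java stand-ins (a map-backed store, string keys), not the real Flink `Configuration`/`ConfigOption` API: a delegating view that prepends the prefix in one getter but forgets it in another reads the wrong key and silently falls back to the default.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for DelegatingConfiguration; NOT the real Flink API.
public class DelegatingConfigDemo {
    public static final Map<String, String> BACKING = new HashMap<>();
    public static final String PREFIX = "my.prefix.";

    // Correct behavior: look up "my.prefix.<key>" in the backing configuration.
    public static String getString(String key, String def) {
        return BACKING.getOrDefault(PREFIX + key, def);
    }

    // Buggy variant mirroring the bug class fixed here: the prefix is not
    // applied, so a value stored under the prefixed key is never found.
    public static int getIntegerBuggy(String key, int def) {
        String v = BACKING.get(key); // should be BACKING.get(PREFIX + key)
        return v == null ? def : Integer.parseInt(v);
    }

    // Fixed variant: prefix applied consistently with getString.
    public static int getIntegerFixed(String key, int def) {
        String v = BACKING.get(PREFIX + key);
        return v == null ? def : Integer.parseInt(v);
    }

    public static void main(String[] args) {
        BACKING.put("my.prefix.parallelism", "8");
        System.out.println(getString("parallelism", "none"));  // prints 8
        System.out.println(getIntegerBuggy("parallelism", 4)); // prints 4 (default: key missed)
        System.out.println(getIntegerFixed("parallelism", 4)); // prints 8
    }
}
```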
   
   ## Brief change log
   
   - [FLINK-33942][configuration][junit5-migration] Migrate 
DelegatingConfigurationTest to Junit5 and Assertj
   - [FLINK-33942][configuration] Fix the bug that DelegatingConfiguration 
misses the prefix in some get methods
   - [FLINK-33942][configuration][refactor] Using ConfigOption instead of 
string key in DelegatingConfiguration
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - *DelegatingConfigurationTest#testGetWithOverrideDefault*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-33942) DelegatingConfiguration misses the prefix for some methods

2023-12-26 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-33942:

Description: 
DelegatingConfiguration misses the prefix for some methods, such as:
 * 
DelegatingConfiguration#getInteger(org.apache.flink.configuration.ConfigOption,
 int)
 * 
DelegatingConfiguration#getLong(org.apache.flink.configuration.ConfigOption,
 long)
 * 
org.apache.flink.configuration.DelegatingConfiguration#getBoolean(org.apache.flink.configuration.ConfigOption,
 boolean)
 * 
org.apache.flink.configuration.DelegatingConfiguration#getFloat(org.apache.flink.configuration.ConfigOption,
 float)
 * 
org.apache.flink.configuration.DelegatingConfiguration#getDouble(org.apache.flink.configuration.ConfigOption,
 double)

  was:
* 
DelegatingConfiguration#getInteger(org.apache.flink.configuration.ConfigOption,
 int)
 * 
DelegatingConfiguration#getLong(org.apache.flink.configuration.ConfigOption,
 long)
 * 
org.apache.flink.configuration.DelegatingConfiguration#getBoolean(org.apache.flink.configuration.ConfigOption,
 boolean)
 * 
org.apache.flink.configuration.DelegatingConfiguration#getFloat(org.apache.flink.configuration.ConfigOption,
 float)
 * 
org.apache.flink.configuration.DelegatingConfiguration#getDouble(org.apache.flink.configuration.ConfigOption,
 double)


> DelegatingConfiguration misses the perfix for some methods
> --
>
> Key: FLINK-33942
> URL: https://issues.apache.org/jira/browse/FLINK-33942
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
> Fix For: 1.19.0, 1.17.3, 1.18.2
>
>
> DelegatingConfiguration misses the perfix for some methods, such as:
>  * 
> DelegatingConfiguration#getInteger(org.apache.flink.configuration.ConfigOption,
>  int)
>  * 
> DelegatingConfiguration#getLong(org.apache.flink.configuration.ConfigOption,
>  long)
>  * 
> org.apache.flink.configuration.DelegatingConfiguration#getBoolean(org.apache.flink.configuration.ConfigOption,
>  boolean)
>  * 
> org.apache.flink.configuration.DelegatingConfiguration#getFloat(org.apache.flink.configuration.ConfigOption,
>  float)
>  * 
> org.apache.flink.configuration.DelegatingConfiguration#getDouble(org.apache.flink.configuration.ConfigOption,
>  double)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33930] Stateless stop job status exception [flink-kubernetes-operator]

2023-12-26 Thread via GitHub


hunter-cloud09 commented on PR #740:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/740#issuecomment-1869390002

   Hi @gyfora, I added a test case. PTAL.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-32849][runtime][JUnit5 Migration] The resourcemanager package of flink-runtime module [flink]

2023-12-26 Thread via GitHub


RocMarshal commented on PR #23973:
URL: https://github.com/apache/flink/pull/23973#issuecomment-1869379533

   Hi, thanks so much @1996fanrui for your review and patience.
   
   I updated the PR based on your comments, and I triggered the Flink CI bot 
about 4 times by amending the commit (--amend). It finally succeeded.
   
   PTAL in your free time~ :)





[jira] [Created] (FLINK-33942) DelegatingConfiguration misses the prefix for some methods

2023-12-26 Thread Rui Fan (Jira)
Rui Fan created FLINK-33942:
---

 Summary: DelegatingConfiguration misses the prefix for some methods
 Key: FLINK-33942
 URL: https://issues.apache.org/jira/browse/FLINK-33942
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Configuration
Reporter: Rui Fan
Assignee: Rui Fan


* DelegatingConfiguration#getInteger(org.apache.flink.configuration.ConfigOption, int)
 * DelegatingConfiguration#getLong(org.apache.flink.configuration.ConfigOption, long)
 * org.apache.flink.configuration.DelegatingConfiguration#getBoolean(org.apache.flink.configuration.ConfigOption, boolean)
 * org.apache.flink.configuration.DelegatingConfiguration#getFloat(org.apache.flink.configuration.ConfigOption, float)
 * org.apache.flink.configuration.DelegatingConfiguration#getDouble(org.apache.flink.configuration.ConfigOption, double)
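The bug class behind this ticket (a delegating wrapper that forgets to apply its key prefix in some getters) can be illustrated with a minimal, self-contained sketch. The class and method names below are hypothetical and only mimic the pattern of Flink's `DelegatingConfiguration`; this is not its actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// A tiny stand-in for a configuration backed by a string map.
class SimpleConfig {
    protected final Map<String, String> values = new HashMap<>();
    String get(String key) { return values.get(key); }
    void set(String key, String v) { values.put(key, v); }
}

// A delegating view that is supposed to prepend a prefix on every lookup.
class PrefixedConfig extends SimpleConfig {
    private final SimpleConfig backing;
    private final String prefix;

    PrefixedConfig(SimpleConfig backing, String prefix) {
        this.backing = backing;
        this.prefix = prefix;
    }

    @Override
    String get(String key) {
        return backing.get(prefix + key); // correct: prefix applied
    }

    // A getter that forgets the prefix reads the wrong key and
    // silently falls back to null / the default value.
    String getBuggy(String key) {
        return backing.get(key);
    }
}

public class PrefixDemo {
    public static void main(String[] args) {
        SimpleConfig base = new SimpleConfig();
        base.set("a.b.key", "42");
        PrefixedConfig cfg = new PrefixedConfig(base, "a.b.");
        System.out.println(cfg.get("key"));      // prints 42
        System.out.println(cfg.getBuggy("key")); // prints null
    }
}
```

The failure mode is easy to miss in tests because the buggy getter does not throw; it just resolves the un-prefixed key.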





[jira] [Updated] (FLINK-33942) DelegatingConfiguration misses the prefix for some methods

2023-12-26 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-33942:

Fix Version/s: 1.19.0
   1.17.3
   1.18.2

> DelegatingConfiguration misses the prefix for some methods
> --
>
> Key: FLINK-33942
> URL: https://issues.apache.org/jira/browse/FLINK-33942
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
> Fix For: 1.19.0, 1.17.3, 1.18.2
>
>
> * DelegatingConfiguration#getInteger(org.apache.flink.configuration.ConfigOption, int)
>  * DelegatingConfiguration#getLong(org.apache.flink.configuration.ConfigOption, long)
>  * org.apache.flink.configuration.DelegatingConfiguration#getBoolean(org.apache.flink.configuration.ConfigOption, boolean)
>  * org.apache.flink.configuration.DelegatingConfiguration#getFloat(org.apache.flink.configuration.ConfigOption, float)
>  * org.apache.flink.configuration.DelegatingConfiguration#getDouble(org.apache.flink.configuration.ConfigOption, double)





Re: [PR] [FLINK-33795] Add a new config to forbid autoscale execution in the configured excluded periods [flink-kubernetes-operator]

2023-12-26 Thread via GitHub


1996fanrui commented on code in PR #728:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/728#discussion_r1436336459


##
flink-autoscaler/src/main/java/org/apache/flink/autoscaler/config/AutoScalerOptions.java:
##
@@ -72,6 +72,21 @@ private static ConfigOptions.OptionBuilder autoScalerConfig(String key) {
 .withDescription(
 "Stabilization period in which no new scaling will be executed");
 
+public static final ConfigOption<List<String>> EXCLUDED_PERIODS =
+autoScalerConfig("excluded.periods")
+.stringType()
+.asList()
+.defaultValues()

Review Comment:
   It's fine for me.






Re: [PR] [FLINK-33795] Add a new config to forbid autoscale execution in the configured excluded periods [flink-kubernetes-operator]

2023-12-26 Thread via GitHub


flashJd commented on code in PR #728:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/728#discussion_r143629


##
flink-autoscaler/src/main/java/org/apache/flink/autoscaler/config/AutoScalerOptions.java:
##
@@ -72,6 +72,21 @@ private static ConfigOptions.OptionBuilder autoScalerConfig(String key) {
 .withDescription(
 "Stabilization period in which no new scaling will be executed");
 
+public static final ConfigOption<List<String>> EXCLUDED_PERIODS =
+autoScalerConfig("excluded.periods")
+.stringType()
+.asList()
+.defaultValues()

Review Comment:
   @1996fanrui changing the defaultValue to `noDefaultValue` causes an NPE, and 
we would have to add many annoying notNull checks. The config 
`vertex.exclude.ids` is also set to defaultValues(), so I followed that 
precedent. What do you think?
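The trade-off discussed here (no default value returning null versus an empty-list default) can be sketched as follows. The accessor names are made up for illustration and do not correspond to the actual operator code.

```java
import java.util.Collections;
import java.util.List;

public class DefaultsDemo {
    // With no default value, a consumer receives null and every call
    // site needs a null check before iterating.
    static List<String> excludedPeriodsNoDefault() {
        return null;
    }

    // With an empty-list default, call sites can iterate directly.
    static List<String> excludedPeriodsEmptyDefault() {
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        List<String> periods = excludedPeriodsNoDefault();
        // periods.isEmpty() here would throw a NullPointerException
        System.out.println(periods == null); // prints true
        System.out.println(excludedPeriodsEmptyDefault().isEmpty()); // prints true
    }
}
```

An empty default keeps the "no excluded periods" case indistinguishable from "not configured", which is exactly the semantics the autoscaler wants.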






[jira] [Commented] (FLINK-33940) Update the auto-derivation rule of max parallelism for enlarged upscaling space

2023-12-26 Thread Zhanghao Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800444#comment-17800444
 ] 

Zhanghao Chen commented on FLINK-33940:
---

Thanks [~gyfora] for pointing that out. We have also observed data skew 
introduced by imbalanced key-group assignment in our production use, so using 
highly composite numbers should be helpful. How about the following formula: 
{{min(max(roundUpToHighlyCompositeNum(operatorParallelism * 5), 840), 45360)}}, 
where the highly composite number series is defined in [A002182 - 
OEIS|https://oeis.org/A002182]?
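For illustration, a brute-force `roundUpToHighlyCompositeNum` (a name assumed from the comment above, not an existing Flink utility) can be sketched as below; a real implementation would precompute the short A002182 table up to the cap instead of counting divisors.

```java
public class HcnDemo {
    // Number of divisors of n, by trial division.
    static int divisors(int n) {
        int d = 0;
        for (int i = 1; i <= n; i++) {
            if (n % i == 0) d++;
        }
        return d;
    }

    // Smallest highly composite number (OEIS A002182) >= x. A highly
    // composite number has more divisors than any smaller positive integer,
    // so we scan upward and track the running divisor-count record.
    static int roundUpToHighlyCompositeNum(int x) {
        int record = 0;
        for (int n = 1; ; n++) {
            int d = divisors(n);
            if (d > record) {      // n sets a new record: n is in A002182
                record = d;
                if (n >= x) return n;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(roundUpToHighlyCompositeNum(150)); // prints 180
        System.out.println(roundUpToHighlyCompositeNum(900)); // prints 1260
    }
}
```

Highly composite numbers (1, 2, 4, 6, 12, ..., 840, 1260, ...) have many divisors, so the max parallelism divides evenly across many more candidate parallelism values than a power of two does, which is what mitigates the key-group skew mentioned above.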

> Update the auto-derivation rule of max parallelism for enlarged upscaling 
> space
> ---
>
> Key: FLINK-33940
> URL: https://issues.apache.org/jira/browse/FLINK-33940
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core
>Reporter: Zhanghao Chen
>Assignee: Zhanghao Chen
>Priority: Major
>
> *Background*
> The choice of the max parallelism of an stateful operator is important as it 
> limits the upper bound of the parallelism of the opeartor while it can also 
> add extra overhead when being set too large. Currently, the max parallelism 
> of an opeartor is either fixed to a value specified by API core / pipeline 
> option or auto-derived with the following rules:
> {{min(max(roundUpToPowerOfTwo(operatorParallelism * 1.5), 128), 32767)}}
> *Problem*
> Recently, the elasticity of Flink jobs is becoming more and more valued by 
> users. The current auto-derived max parallelism was introduced a time time 
> ago and only allows the operator parallelism to be roughly doubled, which is 
> not desired for elasticity. Setting an max parallelism manually may not be 
> desired as well: users may not have the sufficient expertise to select a good 
> max-parallelism value.
> *Proposal*
> Update the auto-derivation rule of max parallelism to derive larger max 
> parallelism for better elasticity experience out of the box. A candidate is 
> as follows:
> {{min(max(roundUpToPowerOfTwo(operatorParallelism * {*}5{*}), {*}1024{*}), 
> 32767)}}
> Looking forward to your opinions on this.
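The current and proposed derivation rules quoted above can be compared with a small sketch. `roundUpToPowerOfTwo` is reimplemented locally here for self-containment rather than taken from Flink's internal utilities.

```java
public class MaxParallelismDemo {
    // Current auto-derivation rule: min(max(roundUpToPowerOfTwo(p * 1.5), 128), 32767)
    static int currentRule(int parallelism) {
        return Math.min(Math.max(roundUpToPowerOfTwo((int) (parallelism * 1.5)), 128), 32767);
    }

    // Proposed candidate from the ticket: min(max(roundUpToPowerOfTwo(p * 5), 1024), 32767)
    static int proposedRule(int parallelism) {
        return Math.min(Math.max(roundUpToPowerOfTwo(parallelism * 5), 1024), 32767);
    }

    // Smallest power of two >= x (for x >= 1).
    static int roundUpToPowerOfTwo(int x) {
        int p = 1;
        while (p < x) p <<= 1;
        return p;
    }

    public static void main(String[] args) {
        System.out.println(currentRule(100));  // prints 256
        System.out.println(proposedRule(100)); // prints 1024
        System.out.println(proposedRule(500)); // prints 4096
    }
}
```

For a parallelism of 100, the current rule caps rescaling at 256 key groups while the proposal allows 1024, roughly a 10x upscaling headroom instead of ~2.5x.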





[jira] [Commented] (FLINK-20281) Window aggregation supports changelog stream input

2023-12-26 Thread xuyang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800443#comment-17800443
 ] 

xuyang commented on FLINK-20281:


Can I take this Jira?

> Window aggregation supports changelog stream input
> --
>
> Key: FLINK-20281
> URL: https://issues.apache.org/jira/browse/FLINK-20281
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Planner, Table SQL / Runtime
>Reporter: Jark Wu
>Priority: Major
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
> Attachments: screenshot-1.png
>
>
> Currently, window aggregation doesn't support consuming a changelog stream. 
> This makes it impossible to do a window aggregation on changelog sources 
> (e.g. Kafka with Debezium format, or upsert-kafka, or mysql-cdc). 





[jira] [Commented] (FLINK-33938) Correct implicit coercions in relational operators to adopt typescript 5.0

2023-12-26 Thread Ao Yuchen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800439#comment-17800439
 ] 

Ao Yuchen commented on FLINK-33938:
---

Would you like to take a look at this ticket? [~simplejason] 

> Correct implicit coercions in relational operators to adopt typescript 5.0
> --
>
> Key: FLINK-33938
> URL: https://issues.apache.org/jira/browse/FLINK-33938
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Ao Yuchen
>Priority: Major
> Fix For: 1.19.0, 1.17.3, 1.18.2
>
>
> Since TypeScript 5.0, there is a breaking change: implicit coercions in 
> relational operators are forbidden [1].
> As a result, the following code in 
> flink-runtime-web/web-dashboard/src/app/components/humanize-date.pipe.ts gets 
> an error:
> {code:java}
> public transform(
>   value: number | string | Date,
>   ...
> ): string | null | undefined {
>   if (value == null || value === '' || value !== value || value < 0) {
> return '-';
>   } 
>   ...
> }{code}
> The correctness improvement is available here [2].
> I think we should optimize this type of code for better compatibility.
>  
> [1] 
> [https://devblogs.microsoft.com/typescript/announcing-typescript-5-0/#forbidden-implicit-coercions-in-relational-operators]
> [2] [https://github.com/microsoft/TypeScript/pull/52048]


