[GitHub] [flink] felixzh2020 commented on pull request #22093: [FLINK-31321] Yarn-session mode, securityConfiguration supports dynam…

2023-03-03 Thread via GitHub


felixzh2020 commented on PR #22093:
URL: https://github.com/apache/flink/pull/22093#issuecomment-1454657716

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-31323) Fix unstable merge-into E2E test

2023-03-03 Thread yuzelin (Jira)
yuzelin created FLINK-31323:
---

 Summary: Fix unstable merge-into E2E test
 Key: FLINK-31323
 URL: https://issues.apache.org/jira/browse/FLINK-31323
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Affects Versions: table-store-0.4.0
Reporter: yuzelin


A complex test of the merge-into action in the Docker environment may fail, so the test 
needs to be simplified.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31322) Improve merge-into action

2023-03-03 Thread yuzelin (Jira)
yuzelin created FLINK-31322:
---

 Summary: Improve merge-into action
 Key: FLINK-31322
 URL: https://issues.apache.org/jira/browse/FLINK-31322
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Affects Versions: table-store-0.4.0
Reporter: yuzelin


Umbrella issue for bug fixing and code improvement.





[GitHub] [flink] XComp commented on pull request #21736: [FLINK-27518][tests] Refactor migration tests to support version update automatically

2023-03-03 Thread via GitHub


XComp commented on PR #21736:
URL: https://github.com/apache/flink/pull/21736#issuecomment-1454533512

   Not sure whether I understand you in the right way here: I didn't mean to 
push the generated test data in multiple commits (e.g. one commit per test 
class). I meant that we want to prepare this PR to have one commit for the 
refactoring of the test data generation and one commit for the generated data. 
Does that make sense?🤔
   
   But on another note: I revisited your proposal for how the process 
should look. I came to the conclusion that we shouldn't create the test 
data when creating the release branch. We still have to do it after the new 
minor release (in our current case 1.17.0) is published. The test data should 
be generated using the code version of the minor version's git tag (i.e. 
`release-1.17.0`). That's the baseline for migration tests. Therefore, it must 
be possible to control the return value of 
`FlinkVersion.getMostRecentlyPublished()`. Automatically deriving it from the 
enum by taking the element before the last element is not good enough. Do you agree?
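   For illustration, the fragility of the derived value can be sketched in plain Java (a hypothetical stand-in enum, not the actual `FlinkVersion` class):

```java
public class VersionBaseline {
    // Hypothetical stand-in for Flink's FlinkVersion enum: on master, entries
    // for upcoming versions already exist before those releases are published.
    enum FlinkVersion { v1_15, v1_16, v1_17, v1_18 }

    // The "element before the last element" derivation questioned above.
    static FlinkVersion derivedMostRecentlyPublished() {
        FlinkVersion[] all = FlinkVersion.values();
        return all[all.length - 2];
    }

    public static void main(String[] args) {
        // While 1.17.0 is still unpublished but v1_18 was already added on
        // master, the derived baseline is v1_17 instead of the actually
        // published v1_16 -- hence the need for an explicitly controlled value.
        System.out.println(derivedMostRecentlyPublished()); // prints v1_17
    }
}
```

   The sketch only shows why the enum position alone cannot tell whether a version has been published.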





[GitHub] [flink] flinkbot commented on pull request #22093: [FLINK-31321] Yarn-session mode, securityConfiguration supports dynam…

2023-03-03 Thread via GitHub


flinkbot commented on PR #22093:
URL: https://github.com/apache/flink/pull/22093#issuecomment-1454510036

   
   ## CI report:
   
   * e7c1a87f3ea4bc93612810e695a15bd2478b181d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-31321) Yarn-session mode, securityConfiguration supports dynamic configuration

2023-03-03 Thread felixzh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

felixzh updated FLINK-31321:

Component/s: Deployment / YARN

> Yarn-session mode, securityConfiguration supports dynamic configuration
> ---
>
> Key: FLINK-31321
> URL: https://issues.apache.org/jira/browse/FLINK-31321
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: felixzh
>Priority: Major
>  Labels: pull-request-available
>
> When different tenants submit jobs using the same {_}flink-conf.yaml{_}, the 
> same user is displayed on the Yarn page.
> _SecurityConfiguration_ does not support dynamic configuration. Therefore, 
> the user displayed on the Yarn page is the 
> _security.kerberos.login.principal_ from the {_}flink-conf.yaml{_}.
> FLINK-29435 only fixed the CliFrontend class (corresponding to the flink script).
> The FlinkYarnSessionCli class (corresponding to the yarn-session.sh script) 
> still has this issue.





[jira] [Updated] (FLINK-31321) Yarn-session mode, securityConfiguration supports dynamic configuration

2023-03-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-31321:
---
Labels: pull-request-available  (was: )

> Yarn-session mode, securityConfiguration supports dynamic configuration
> ---
>
> Key: FLINK-31321
> URL: https://issues.apache.org/jira/browse/FLINK-31321
> Project: Flink
>  Issue Type: Improvement
>Reporter: felixzh
>Priority: Major
>  Labels: pull-request-available
>
> When different tenants submit jobs using the same {_}flink-conf.yaml{_}, the 
> same user is displayed on the Yarn page.
> _SecurityConfiguration_ does not support dynamic configuration. Therefore, 
> the user displayed on the Yarn page is the 
> _security.kerberos.login.principal_ from the {_}flink-conf.yaml{_}.
> FLINK-29435 only fixed the CliFrontend class (corresponding to the flink script).
> The FlinkYarnSessionCli class (corresponding to the yarn-session.sh script) 
> still has this issue.





[GitHub] [flink] felixzh2020 opened a new pull request, #22093: [FLINK-31321] Yarn-session mode, securityConfiguration supports dynam…

2023-03-03 Thread via GitHub


felixzh2020 opened a new pull request, #22093:
URL: https://github.com/apache/flink/pull/22093

   …ic configuration
   
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follow the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   





[jira] [Updated] (FLINK-31321) Yarn-session mode, securityConfiguration supports dynamic configuration

2023-03-03 Thread felixzh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

felixzh updated FLINK-31321:

Description: 
When different tenants submit jobs using the same {_}flink-conf.yaml{_}, the 
same user is displayed on the Yarn page.
_SecurityConfiguration_ does not support dynamic configuration. Therefore, the 
user displayed on the Yarn page is the _security.kerberos.login.principal_ from 
the {_}flink-conf.yaml{_}.

FLINK-29435 only fixed the CliFrontend class (corresponding to the flink script).

The FlinkYarnSessionCli class (corresponding to the yarn-session.sh script) 
still has this issue.
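The intended per-tenant override can be sketched in plain Java (an illustrative merge; `effectiveConfig` and the map-based configuration are hypothetical helpers, not Flink's actual SecurityConfiguration API):

```java
import java.util.HashMap;
import java.util.Map;

public class DynamicSecurityConfig {
    // Hypothetical helper illustrating the requested behavior: dynamic
    // properties passed on the command line should override the values from
    // the shared flink-conf.yaml before the security context is created.
    static Map<String, String> effectiveConfig(
            Map<String, String> flinkConfYaml, Map<String, String> dynamicProperties) {
        Map<String, String> merged = new HashMap<>(flinkConfYaml);
        merged.putAll(dynamicProperties); // per-tenant overrides win
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> yaml = new HashMap<>();
        yaml.put("security.kerberos.login.principal", "shared-user");

        Map<String, String> tenantOverrides = new HashMap<>();
        tenantOverrides.put("security.kerberos.login.principal", "tenantA");

        // The tenant's principal, not the shared one, should take effect:
        System.out.println(effectiveConfig(yaml, tenantOverrides)
                .get("security.kerberos.login.principal")); // prints tenantA
    }
}
```

With such a merge applied in FlinkYarnSessionCli as well, each tenant's principal would show up on the Yarn page instead of the shared one.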

> Yarn-session mode, securityConfiguration supports dynamic configuration
> ---
>
> Key: FLINK-31321
> URL: https://issues.apache.org/jira/browse/FLINK-31321
> Project: Flink
>  Issue Type: Improvement
>Reporter: felixzh
>Priority: Major
>
> When different tenants submit jobs using the same {_}flink-conf.yaml{_}, the 
> same user is displayed on the Yarn page.
> _SecurityConfiguration_ does not support dynamic configuration. Therefore, 
> the user displayed on the Yarn page is the 
> _security.kerberos.login.principal_ from the {_}flink-conf.yaml{_}.
> FLINK-29435 only fixed the CliFrontend class (corresponding to the flink script).
> The FlinkYarnSessionCli class (corresponding to the yarn-session.sh script) 
> still has this issue.





[GitHub] [flink] lindong28 commented on pull request #21557: [FLINK-30501] Update Flink build instruction to deprecate Java 8 instead of requiring Java 11

2023-03-03 Thread via GitHub


lindong28 commented on PR #21557:
URL: https://github.com/apache/flink/pull/21557#issuecomment-1454452696

   @reswqa Can you help review this PR? Thanks!





[GitHub] [flink] lindong28 commented on pull request #22092: [FLINK-30501] Update Flink build instruction to deprecate Java 8 instead of requiring Java 11

2023-03-03 Thread via GitHub


lindong28 commented on PR #22092:
URL: https://github.com/apache/flink/pull/22092#issuecomment-1454449949

   @reswqa One moment. Let me update that PR based on the comments.





[GitHub] [flink] lindong28 commented on pull request #22092: [FLINK-30501] Update Flink build instruction to deprecate Java 8 instead of requiring Java 11

2023-03-03 Thread via GitHub


lindong28 commented on PR #22092:
URL: https://github.com/apache/flink/pull/22092#issuecomment-1454447110

   @reswqa Yes, the PR for the master branch can be found at 
https://github.com/apache/flink/pull/21557. Can you also help review that PR? 
Thanks!





[GitHub] [flink] reswqa commented on pull request #22092: [FLINK-30501] Update Flink build instruction to deprecate Java 8 instead of requiring Java 11

2023-03-03 Thread via GitHub


reswqa commented on PR #22092:
URL: https://github.com/apache/flink/pull/22092#issuecomment-1454445020

   A small question: do we need to cherry-pick these changes to the master branch? 





[GitHub] [flink] lindong28 merged pull request #22092: [FLINK-30501] Update Flink build instruction to deprecate Java 8 instead of requiring Java 11

2023-03-03 Thread via GitHub


lindong28 merged PR #22092:
URL: https://github.com/apache/flink/pull/22092





[GitHub] [flink] lindong28 commented on pull request #22092: [FLINK-30501] Update Flink build instruction to deprecate Java 8 instead of requiring Java 11

2023-03-03 Thread via GitHub


lindong28 commented on PR #22092:
URL: https://github.com/apache/flink/pull/22092#issuecomment-1454431262

   Thank you @reswqa for the review. I have updated the PR as suggested. Can 
you take another look?





[GitHub] [flink] reswqa commented on a diff in pull request #22092: [FLINK-30501] Update Flink build instruction to deprecate Java 8 instead of requiring Java 11

2023-03-03 Thread via GitHub


reswqa commented on code in PR #22092:
URL: https://github.com/apache/flink/pull/22092#discussion_r1125325957


##
docs/content/docs/dev/configuration/gradle.md:
##
@@ -31,7 +31,7 @@ to automate tasks in the development process.
 ## Requirements
 
 - Gradle 7.x 
-- Java 11
+- Java 8 (deprecated) or Java 11

Review Comment:
   Do we also need to modify the Chinese document of `gradle.md`.



##
docs/content.zh/docs/dev/configuration/maven.md:
##
@@ -29,7 +29,7 @@ under the License.
 ## 要求
 
 - Maven 3.0.4 (or higher)
-- Java 11
+- Java 8 (deprecated) or Java 11

Review Comment:
   Do we also need to modify the English document of `maven.md`.









[jira] [Comment Edited] (FLINK-30501) Update Flink build instruction to deprecate Java 8 instead of requiring Java 11

2023-03-03 Thread Dong Lin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17695955#comment-17695955
 ] 

Dong Lin edited comment on FLINK-30501 at 3/4/23 3:51 AM:
--

Note that the latest [Spark version 
3.3.2|https://spark.apache.org/docs/latest/building-spark.html] still supports 
Java 8. And the [latest Kafka version 
3.3.x|https://kafka.apache.org/documentation/] also supports Java 8. Both 
projects are widely used and they explicitly list the supported Java versions 
on their official doc website.

And it is explicitly mentioned on the Kafka website that "Java 8, Java 11, and 
Java 17 are supported. Note that Java 8 support has been deprecated since 
Apache Kafka 3.0 and will be removed in Apache Kafka 4.0".

Maybe we should follow their approach regarding whether to specify Java 8 
support and how to encourage users to use Java 11.


was (Author: lindong):
Note that the latest [Spark version 
3.3.2|https://spark.apache.org/docs/latest/building-spark.html] still supports 
Java 8. And the [latest Kafka version 3.3.x|http://example.com] also supports 
Java 8. Both projects are widely used and they explicitly list the supported 
Java versions on their official doc website.

And it is explicitly mentioned on the Kafka website that "Java 8, Java 11, and 
Java 17 are supported. Note that Java 8 support has been deprecated since 
Apache Kafka 3.0 and will be removed in Apache Kafka 4.0".

Maybe we should follow their approach regarding whether to specify Java 8 
support and how to encourage users to use Java 11.

> Update Flink build instruction to deprecate Java 8 instead of requiring Java 
> 11
> ---
>
> Key: FLINK-30501
> URL: https://issues.apache.org/jira/browse/FLINK-30501
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System / CI
>Reporter: Dong Lin
>Assignee: Dong Lin
>Priority: Major
>  Labels: pull-request-available
>
> Flink 1.15 and later versions require at least Java 11 to build from sources 
> [1], whereas the pom.xml specifies the source/target is 1.8. This 
> inconsistency confuses users.
> As mentioned in the FLINK-25247 title, the goal of that ticket is to "Inform 
> users about deprecation". It will be better to inform users that "Java 8 is 
> deprecated" instead of saying "Flink requires at least Java 11 to build", so 
> that users have the right information to make the right choice for themselves.
> Also note that Flink community is regularly running flink-ml benchmark for 
> both Java 8 and Java 11 [2], which suggests that we are practically ensuring 
> Java 8 is supported.
> [1] 
> [https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/flinkdev/building/]
> [2] 
> [http://codespeed.dak8s.net:8000/timeline/?ben=mapSink.F27_UNBOUNDED&env=2]





[jira] [Created] (FLINK-31321) Yarn-session mode, securityConfiguration supports dynamic configuration

2023-03-03 Thread felixzh (Jira)
felixzh created FLINK-31321:
---

 Summary: Yarn-session mode, securityConfiguration supports dynamic 
configuration
 Key: FLINK-31321
 URL: https://issues.apache.org/jira/browse/FLINK-31321
 Project: Flink
  Issue Type: Improvement
Reporter: felixzh








[GitHub] [flink] flinkbot commented on pull request #22092: [FLINK-30501] Update Flink build instruction to deprecate Java 8 instead of requiring Java 11

2023-03-03 Thread via GitHub


flinkbot commented on PR #22092:
URL: https://github.com/apache/flink/pull/22092#issuecomment-1454361495

   
   ## CI report:
   
   * fd81ec2b0269b0219f2173fec7ae3e11e4e0b38c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink] lindong28 opened a new pull request, #22092: [FLINK-30501] Update Flink build instruction to deprecate Java 8 instead of requiring Java 11

2023-03-03 Thread via GitHub


lindong28 opened a new pull request, #22092:
URL: https://github.com/apache/flink/pull/22092

   ## What is the purpose of the change
   
   Flink 1.15 and later versions require at least Java 11 to build from sources 
[1], whereas the pom.xml specifies the source/target is 1.8. This inconsistency 
confuses users.
   
   As mentioned in the 
[FLINK-25247](https://issues.apache.org/jira/browse/FLINK-25247) title, the 
goal of that ticket is to "Inform users about deprecation". It will be better 
to inform users that "Java 8 is deprecated" instead of saying "Flink requires at 
least Java 11 to build", so that users have the right information to make the 
right choice for themselves.
   
   Also note that Flink community is regularly running flink-ml benchmark for 
both Java 8 and Java 11 [2], which suggests that we are practically ensuring 
Java 8 is supported.
   
   [1] 
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/flinkdev/building/
   [2] http://codespeed.dak8s.net:8000/timeline/?ben=mapSink.F27_UNBOUNDED&env=2
   
   
   ## Brief change log
   
   Update Flink doc to mention that "Java 8 is deprecated" instead of saying 
"Flink requires at least Java 11 to build".
   
   ## Verifying this change
   
   N/A.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector:  no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? N/A
   





[GitHub] [flink-connector-opensearch] lilyevsky commented on a diff in pull request #11: [FLINK-30998] Add optional exception handler to flink-connector-opensearch

2023-03-03 Thread via GitHub


lilyevsky commented on code in PR #11:
URL: 
https://github.com/apache/flink-connector-opensearch/pull/11#discussion_r1125187664


##
flink-connector-opensearch/src/main/java/org/apache/flink/connector/opensearch/sink/OpensearchWriter.java:
##
@@ -122,10 +128,11 @@
 } catch (Exception e) {
 throw new FlinkRuntimeException("Failed to open the 
OpensearchEmitter", e);
 }
+this.failureHandler = failureHandler;
 }
 
 @Override
-public void write(IN element, Context context) throws IOException, 
InterruptedException {
+public void write(IN element, Context context) throws InterruptedException 
{

Review Comment:
   @reta No problem, could you please clarify: you want me to put back the 
IOException to both write and flush methods, correct? Also, I am not sure what 
you mean by addressing  
[this](https://github.com/apache/flink-connector-opensearch/pull/11/files#r1110066551)
 . Is it about removing the "@SuppressWarnings("All")"?
   Please confirm.






[GitHub] [flink] mohsenrezaeithe commented on a diff in pull request #21849: [FLINK-30596][Runtime/REST] Fix duplicate jobs when submitting with the same jobId

2023-03-03 Thread via GitHub


mohsenrezaeithe commented on code in PR #21849:
URL: https://github.com/apache/flink/pull/21849#discussion_r1125140028


##
flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java:
##
@@ -1394,13 +1394,14 @@ private CompletableFuture waitForTerminatingJob(
 throwable));
 });
 
-return jobManagerTerminationFuture.thenAcceptAsync(
+return FutureUtils.thenAcceptAsyncIfNotDone(

Review Comment:
   Thanks for the suggestion @huwh.
   
   I added a new `submittedAndWaitingTerminationJobs` set for the outstanding 
`JobID`s; however, I wasn't sure whether the lifecycle of the items in there 
needs to follow the lifecycle of the items in 
`jobManagerRunnerTerminationFutures` in any way.
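   The idea behind such a set can be sketched minimally (hypothetical names; `DuplicateSubmissionGuard` stands in for the Dispatcher-side bookkeeping, and the real code would key on Flink's `JobID` rather than a String):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class DuplicateSubmissionGuard {
    // Job IDs that were submitted while a previous run with the same ID is
    // still terminating; a second concurrent submission is rejected.
    private final Set<String> submittedAndWaitingTermination =
            ConcurrentHashMap.newKeySet();

    /** Returns true if the job may be submitted, false if it is a duplicate. */
    boolean trySubmit(String jobId) {
        return submittedAndWaitingTermination.add(jobId);
    }

    /** Must be called once the old job's termination future completes. */
    void terminationFinished(String jobId) {
        submittedAndWaitingTermination.remove(jobId);
    }

    public static void main(String[] args) {
        DuplicateSubmissionGuard guard = new DuplicateSubmissionGuard();
        System.out.println(guard.trySubmit("job-1"));  // prints true
        System.out.println(guard.trySubmit("job-1"));  // prints false (duplicate)
        guard.terminationFinished("job-1");
        System.out.println(guard.trySubmit("job-1"));  // prints true again
    }
}
```

   The lifecycle question above maps to where `terminationFinished` is called: it should mirror the point where the entry in `jobManagerRunnerTerminationFutures` is removed.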









[jira] (FLINK-30274) Upgrade commons-collections 3.x to commons-collections4

2023-03-03 Thread Ran Tao (Jira)


[ https://issues.apache.org/jira/browse/FLINK-30274 ]


Ran Tao deleted comment on FLINK-30274:
-

was (Author: lemonjing):
[~martijnvisser] Hi Martijn, I have updated the PR. 

> Upgrade commons-collections 3.x to commons-collections4
> ---
>
> Key: FLINK-30274
> URL: https://issues.apache.org/jira/browse/FLINK-30274
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Build System
>Affects Versions: 1.16.0
>Reporter: Ran Tao
>Assignee: Ran Tao
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-12-02-16-40-22-172.png
>
>
> First, Apache commons-collections 3.x is a Java 1.3-compatible version, and 
> it does not use Java 5 generics. Apache commons-collections4 4.4 is an 
> upgraded version of commons-collections and is built with Java 8.
> Apache Spark had the same issue: [https://github.com/apache/spark/pull/35257]





[jira] (FLINK-30274) Upgrade commons-collections 3.x to commons-collections4

2023-03-03 Thread Ran Tao (Jira)


[ https://issues.apache.org/jira/browse/FLINK-30274 ]


Ran Tao deleted comment on FLINK-30274:
-

was (Author: lemonjing):
[~martijnvisser] Hi, can you help me review it again?

> Upgrade commons-collections 3.x to commons-collections4
> ---
>
> Key: FLINK-30274
> URL: https://issues.apache.org/jira/browse/FLINK-30274
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Build System
>Affects Versions: 1.16.0
>Reporter: Ran Tao
>Assignee: Ran Tao
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2022-12-02-16-40-22-172.png
>
>
> First, Apache commons-collections 3.x is a Java 1.3-compatible version, and 
> it does not use Java 5 generics. Apache commons-collections4 4.4 is an 
> upgraded version of commons-collections and is built with Java 8.
> Apache Spark had the same issue: [https://github.com/apache/spark/pull/35257]





[jira] [Commented] (FLINK-31319) Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs cause potential bug

2023-03-03 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696308#comment-17696308
 ] 

Ran Tao commented on FLINK-31319:
-

Reproduce: using a bounded Kafka source, set partitionDiscoveryIntervalMs=0; the 
job then never quits. [~martijnvisser] [~renqs] WDYT? This may be a big bug.
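The inconsistency can be shown with a minimal sketch (the two guards below mirror the documented `<= 0` option semantics versus the reported `< 0` check; the names are illustrative, not the actual connector code):

```java
public class PartitionDiscoveryCheck {
    // Documented option semantics: an interval <= 0 disables periodic
    // partition discovery, so the source should be treated as bounded.
    static boolean discoveryEnabled(long partitionDiscoveryIntervalMs) {
        return partitionDiscoveryIntervalMs > 0;
    }

    // The reported buggy guard checks (interval < 0) instead, so an interval
    // of exactly 0 is treated as "still discovering" and
    // noMoreNewPartitionSplits is never set for the bounded source.
    static boolean buggyKeepsDiscovering(long partitionDiscoveryIntervalMs) {
        return !(partitionDiscoveryIntervalMs < 0);
    }

    public static void main(String[] args) {
        long interval = 0; // the problematic boundary value
        // The two checks disagree exactly at interval == 0:
        System.out.println(discoveryEnabled(interval));      // prints false
        System.out.println(buggyKeepsDiscovering(interval)); // prints true
    }
}
```

Aligning both guards on the same `<= 0` boundary would let the bounded source reach signalNoMoreSplits and terminate.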

> Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs 
> cause potential bug
> -
>
> Key: FLINK-31319
> URL: https://issues.apache.org/jira/browse/FLINK-31319
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.16.1
>Reporter: Ran Tao
>Priority: Critical
>  Labels: pull-request-available
> Attachments: image-2023-03-04-01-37-29-360.png, 
> image-2023-03-04-01-39-20-352.png, image-2023-03-04-01-40-44-124.png, 
> image-2023-03-04-01-41-55-664.png
>
>
> As the Kafka option description states, partitionDiscoveryIntervalMs <= 0 means 
> disabled.  !image-2023-03-04-01-37-29-360.png|width=781,height=147!
> This matches the check when starting the Kafka enumerator:
> !image-2023-03-04-01-39-20-352.png|width=465,height=311!
> But the inner handlePartitionSplitChanges uses the wrong if condition (< 0):
> !image-2023-03-04-01-40-44-124.png|width=576,height=237!
>  
> This prevents noMoreNewPartitionSplits from being set to true.
> !image-2023-03-04-01-41-55-664.png|width=522,height=610!
> As a result, a bounded source can never signalNoMoreSplits and quit.
> Besides, the two branches of the if condition should be mutually exclusive.
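
A minimal sketch of the boundary inconsistency described above (field and method
names here are hypothetical placeholders, not the actual Flink source): the
documented semantics treat `<= 0` as "discovery disabled", while the split-change
handler only tests `< 0`, so the boundary value 0 falls through both branches and
the enumerator never reports completion.

```java
public class PartitionDiscoverySketch {

    // Documented semantics: a non-positive interval disables periodic discovery.
    static boolean discoveryDisabled(long partitionDiscoveryIntervalMs) {
        return partitionDiscoveryIntervalMs <= 0;
    }

    boolean noMoreNewPartitionSplits = false;

    // Buggy handler: tests < 0, so intervalMs == 0 (discovery disabled!) never
    // marks the enumerator as finished, and a bounded job hangs instead of quitting.
    void handlePartitionSplitChangesBuggy(long partitionDiscoveryIntervalMs) {
        if (partitionDiscoveryIntervalMs < 0) {
            noMoreNewPartitionSplits = true;
        }
    }

    // Consistent handler: the condition mirrors discoveryDisabled(), making the
    // two branches of the if mutually exclusive with periodic discovery.
    void handlePartitionSplitChangesFixed(long partitionDiscoveryIntervalMs) {
        if (partitionDiscoveryIntervalMs <= 0) {
            noMoreNewPartitionSplits = true;
        }
    }
}
```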





[jira] [Resolved] (FLINK-31133) PartiallyFinishedSourcesITCase hangs if a checkpoint fails

2023-03-03 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan resolved FLINK-31133.
---
Resolution: Fixed

Fix merged as 

1.15 51660f840cfc505b9b9cb72530fde7f9f8a4dee2
1.16 cf04b2c08fa04091845bd310990497129c3bcbe8
1.17 6e7703738cdefed17277ea86d2c9dc25393eceac
master 4aacff572a9e3996c5dee9273638831e4040c767

> PartiallyFinishedSourcesITCase hangs if a checkpoint fails
> --
>
> Key: FLINK-31133
> URL: https://issues.apache.org/jira/browse/FLINK-31133
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.15.3, 1.16.1, 1.18.0, 1.17.1
>Reporter: Matthias Pohl
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.2, 1.18.0, 1.17.1, 1.15.5
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46299&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b
> This build ran into a timeout. Based on the stacktraces reported, it was 
> either caused by 
> [SnapshotMigrationTestBase.restoreAndExecute|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46299&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=13475]:
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x7f23d800b800 nid=0x60cdd waiting on 
> condition [0x7f23e1c0d000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.flink.test.checkpointing.utils.SnapshotMigrationTestBase.restoreAndExecute(SnapshotMigrationTestBase.java:382)
>   at 
> org.apache.flink.test.migration.TypeSerializerSnapshotMigrationITCase.testSnapshot(TypeSerializerSnapshotMigrationITCase.java:172)
>   at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)
> [...]
> {code}
> or 
> [PartiallyFinishedSourcesITCase.test|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46299&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=10401]:
> {code}
> 2023-02-20T07:13:05.6084711Z "main" #1 prio=5 os_prio=0 
> tid=0x7fd35c00b800 nid=0x8c8a waiting on condition [0x7fd363d0f000]
> 2023-02-20T07:13:05.6085149Zjava.lang.Thread.State: TIMED_WAITING 
> (sleeping)
> 2023-02-20T07:13:05.6085487Z  at java.lang.Thread.sleep(Native Method)
> 2023-02-20T07:13:05.6085925Z  at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:145)
> 2023-02-20T07:13:05.6086512Z  at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:138)
> 2023-02-20T07:13:05.6087103Z  at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitForSubtasksToFinish(CommonTestUtils.java:291)
> 2023-02-20T07:13:05.6087730Z  at 
> org.apache.flink.runtime.operators.lifecycle.TestJobExecutor.waitForSubtasksToFinish(TestJobExecutor.java:226)
> 2023-02-20T07:13:05.6088410Z  at 
> org.apache.flink.runtime.operators.lifecycle.PartiallyFinishedSourcesITCase.test(PartiallyFinishedSourcesITCase.java:138)
> 2023-02-20T07:13:05.6088957Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [...]
> {code}
> Still, it sounds odd: Based on a code analysis it's quite unlikely that those 
> two caused the issue. The former one has a 5 min timeout (see related code in 
> [SnapshotMigrationTestBase:382|https://github.com/apache/flink/blob/release-1.15/flink-tests/src/test/java/org/apache/flink/test/checkpointing/utils/SnapshotMigrationTestBase.java#L382]).
>  For the other one, we found it being not responsible in the past when some 
> other concurrent test caused the issue (see FLINK-30261).
> An investigation on where we lose the time for the timeout revealed that 
> {{AdaptiveSchedulerITCase}} took 2980s to finish (see [build 
> logs|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46299&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=5265]).
> {code}
> 2023-02-20T03:43:55.4546050Z Feb 20 03:43:55 [ERROR] Picked up 
> JAVA_TOOL_OPTIONS: -XX:+HeapDumpOnOutOfMemoryError
> 2023-02-20T03:43:58.0448506Z Feb 20 03:43:58 [INFO] Running 
> org.apache.flink.test.scheduling.AdaptiveSchedulerITCase
> 2023-02-20T04:33:38.6824634Z Feb 20 04:33:38 [INFO] Tests run: 6, Failures: 
> 0, Errors: 0, Skipped: 0, Time elapsed: 2,980.445 s - in 
> org.apache.flink.test.scheduling.AdaptiveSchedulerITCase
> {code}





[jira] [Updated] (FLINK-27169) PartiallyFinishedSourcesITCase.test hangs on azure

2023-03-03 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan updated FLINK-27169:
--
Affects Version/s: 1.15.3

> PartiallyFinishedSourcesITCase.test hangs on azure
> --
>
> Key: FLINK-27169
> URL: https://issues.apache.org/jira/browse/FLINK-27169
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.16.0, 1.15.3
>Reporter: Yun Gao
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0, 1.15.5
>
>
> {code:java}
> Apr 10 08:32:18 "main" #1 prio=5 os_prio=0 tid=0x7f553400b800 nid=0x8345 
> waiting on condition [0x7f553be6]
> Apr 10 08:32:18java.lang.Thread.State: TIMED_WAITING (sleeping)
> Apr 10 08:32:18   at java.lang.Thread.sleep(Native Method)
> Apr 10 08:32:18   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:145)
> Apr 10 08:32:18   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:138)
> Apr 10 08:32:18   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitForSubtasksToFinish(CommonTestUtils.java:291)
> Apr 10 08:32:18   at 
> org.apache.flink.runtime.operators.lifecycle.TestJobExecutor.waitForSubtasksToFinish(TestJobExecutor.java:226)
> Apr 10 08:32:18   at 
> org.apache.flink.runtime.operators.lifecycle.PartiallyFinishedSourcesITCase.test(PartiallyFinishedSourcesITCase.java:138)
> Apr 10 08:32:18   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Apr 10 08:32:18   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Apr 10 08:32:18   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Apr 10 08:32:18   at java.lang.reflect.Method.invoke(Method.java:498)
> Apr 10 08:32:18   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Apr 10 08:32:18   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Apr 10 08:32:18   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Apr 10 08:32:18   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Apr 10 08:32:18   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> Apr 10 08:32:18   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> Apr 10 08:32:18   at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> Apr 10 08:32:18   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> Apr 10 08:32:18   at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Apr 10 08:32:18   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> Apr 10 08:32:18   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Apr 10 08:32:18   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Apr 10 08:32:18   at org.junit.runners.Suite.runChild(Suite.java:128)
> Apr 10 08:32:18   at org.junit.runners.Suite.runChild(Suite.java:27)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=34484&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=6757





[jira] [Resolved] (FLINK-27169) PartiallyFinishedSourcesITCase.test hangs on azure

2023-03-03 Thread Roman Khachatryan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Khachatryan resolved FLINK-27169.
---
Fix Version/s: 1.15.5
   Resolution: Fixed

Backported to 1.15 as ddec8d8e144c9cc9adb0a04f41c9667cdd68aabb.

> PartiallyFinishedSourcesITCase.test hangs on azure
> --
>
> Key: FLINK-27169
> URL: https://issues.apache.org/jira/browse/FLINK-27169
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.16.0
>Reporter: Yun Gao
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.15.5, 1.16.0
>
>
> {code:java}
> Apr 10 08:32:18 "main" #1 prio=5 os_prio=0 tid=0x7f553400b800 nid=0x8345 
> waiting on condition [0x7f553be6]
> Apr 10 08:32:18java.lang.Thread.State: TIMED_WAITING (sleeping)
> Apr 10 08:32:18   at java.lang.Thread.sleep(Native Method)
> Apr 10 08:32:18   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:145)
> Apr 10 08:32:18   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:138)
> Apr 10 08:32:18   at 
> org.apache.flink.runtime.testutils.CommonTestUtils.waitForSubtasksToFinish(CommonTestUtils.java:291)
> Apr 10 08:32:18   at 
> org.apache.flink.runtime.operators.lifecycle.TestJobExecutor.waitForSubtasksToFinish(TestJobExecutor.java:226)
> Apr 10 08:32:18   at 
> org.apache.flink.runtime.operators.lifecycle.PartiallyFinishedSourcesITCase.test(PartiallyFinishedSourcesITCase.java:138)
> Apr 10 08:32:18   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Apr 10 08:32:18   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Apr 10 08:32:18   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Apr 10 08:32:18   at java.lang.reflect.Method.invoke(Method.java:498)
> Apr 10 08:32:18   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Apr 10 08:32:18   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Apr 10 08:32:18   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Apr 10 08:32:18   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Apr 10 08:32:18   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> Apr 10 08:32:18   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> Apr 10 08:32:18   at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> Apr 10 08:32:18   at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> Apr 10 08:32:18   at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Apr 10 08:32:18   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> Apr 10 08:32:18   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Apr 10 08:32:18   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Apr 10 08:32:18   at org.junit.runners.Suite.runChild(Suite.java:128)
> Apr 10 08:32:18   at org.junit.runners.Suite.runChild(Suite.java:27)
> Apr 10 08:32:18   at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=34484&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=6757





[jira] [Updated] (FLINK-31319) Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs cause potential bug

2023-03-03 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-31319:

Description: 
As the Kafka option description states, partitionDiscoveryIntervalMs <= 0 means 
disabled.  !image-2023-03-04-01-37-29-360.png|width=781,height=147!

This matches the check when starting the Kafka enumerator:

!image-2023-03-04-01-39-20-352.png|width=465,height=311!

But the inner handlePartitionSplitChanges uses the wrong if condition (< 0):

!image-2023-03-04-01-40-44-124.png|width=576,height=237!
 
This prevents noMoreNewPartitionSplits from being set to true.
!image-2023-03-04-01-41-55-664.png|width=522,height=610!

As a result, a bounded source can never signalNoMoreSplits and quit.

Besides, the two branches of the if condition should be mutually exclusive.

  was:
As kafka option description, partitionDiscoveryIntervalMs <=0 means disabled.  
!image-2023-03-04-01-37-29-360.png|width=781,height=147!

just like start kafka enumerator:

!image-2023-03-04-01-39-20-352.png|width=465,height=311!

but inner 
handlePartitionSplitChanges use error if condition( < 0):

!image-2023-03-04-01-40-44-124.png|width=576,height=237!
 
it will cause noMoreNewPartitionSplits can not be set to true. Finally cause
!image-2023-03-04-01-41-55-664.png|width=522,height=610!

bounded source can not signalNoMoreSplits.

Besides,Both ends of the if condition should be mutually exclusive.


> Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs 
> cause potential bug
> -
>
> Key: FLINK-31319
> URL: https://issues.apache.org/jira/browse/FLINK-31319
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.16.1
>Reporter: Ran Tao
>Priority: Critical
>  Labels: pull-request-available
> Attachments: image-2023-03-04-01-37-29-360.png, 
> image-2023-03-04-01-39-20-352.png, image-2023-03-04-01-40-44-124.png, 
> image-2023-03-04-01-41-55-664.png
>
>
> As the Kafka option description states, partitionDiscoveryIntervalMs <= 0 means 
> disabled.  !image-2023-03-04-01-37-29-360.png|width=781,height=147!
> This matches the check when starting the Kafka enumerator:
> !image-2023-03-04-01-39-20-352.png|width=465,height=311!
> But the inner handlePartitionSplitChanges uses the wrong if condition (< 0):
> !image-2023-03-04-01-40-44-124.png|width=576,height=237!
>  
> This prevents noMoreNewPartitionSplits from being set to true.
> !image-2023-03-04-01-41-55-664.png|width=522,height=610!
> As a result, a bounded source can never signalNoMoreSplits and quit.
> Besides, the two branches of the if condition should be mutually exclusive.





[jira] [Updated] (FLINK-31319) Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs cause potential bug

2023-03-03 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-31319:

Description: 
As kafka option description, partitionDiscoveryIntervalMs <=0 means disabled.  
!image-2023-03-04-01-37-29-360.png|width=781,height=147!

just like start kafka enumerator:

!image-2023-03-04-01-39-20-352.png|width=465,height=311!

but inner 
handlePartitionSplitChanges use error if condition( < 0):

!image-2023-03-04-01-40-44-124.png|width=576,height=237!
 
it will cause noMoreNewPartitionSplits can not be set to true. Finally cause
!image-2023-03-04-01-41-55-664.png|width=522,height=610!

bounded source can not signalNoMoreSplits.

Besides,Both ends of the if condition should be mutually exclusive.

  was:
As kafka option description, partitionDiscoveryIntervalMs <=0 means disabled.  
!image-2023-03-04-01-37-29-360.png|width=781,height=147!

just like start kafka enumerator:

!image-2023-03-04-01-39-20-352.png|width=465,height=311!

but inner 
handlePartitionSplitChanges use error if condition( < 0):

!image-2023-03-04-01-40-44-124.png|width=576,height=237!
 
it will cause noMoreNewPartitionSplits can not be set to true. Finally cause
!image-2023-03-04-01-41-55-664.png|width=522,height=610!

bounded source can not signalNoMoreSplits.

Besides,Both ends of the if condition should be mutually exclusive. WDYT?


> Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs 
> cause potential bug
> -
>
> Key: FLINK-31319
> URL: https://issues.apache.org/jira/browse/FLINK-31319
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.16.1
>Reporter: Ran Tao
>Priority: Critical
>  Labels: pull-request-available
> Attachments: image-2023-03-04-01-37-29-360.png, 
> image-2023-03-04-01-39-20-352.png, image-2023-03-04-01-40-44-124.png, 
> image-2023-03-04-01-41-55-664.png
>
>
> As kafka option description, partitionDiscoveryIntervalMs <=0 means disabled. 
>  !image-2023-03-04-01-37-29-360.png|width=781,height=147!
> just like start kafka enumerator:
> !image-2023-03-04-01-39-20-352.png|width=465,height=311!
> but inner 
> handlePartitionSplitChanges use error if condition( < 0):
> !image-2023-03-04-01-40-44-124.png|width=576,height=237!
>  
> it will cause noMoreNewPartitionSplits can not be set to true. Finally cause
> !image-2023-03-04-01-41-55-664.png|width=522,height=610!
> bounded source can not signalNoMoreSplits.
> Besides,Both ends of the if condition should be mutually exclusive.





[GitHub] [flink] rkhachatryan merged pull request #22023: [BP-1.15][FLINK-27169][tests] Increase changelog upload timeout in PartiallyFinishedSourcesITCase

2023-03-03 Thread via GitHub


rkhachatryan merged PR #22023:
URL: https://github.com/apache/flink/pull/22023


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] rkhachatryan merged pull request #22088: [BP-1.15][FLINK-31133][tests] Prevent timeouts in PartiallyFinishedSourcesITCase

2023-03-03 Thread via GitHub


rkhachatryan merged PR #22088:
URL: https://github.com/apache/flink/pull/22088





[GitHub] [flink] rkhachatryan merged pull request #22086: [BP-1.17][FLINK-31133][tests] Prevent timeouts in PartiallyFinishedSourcesITCase

2023-03-03 Thread via GitHub


rkhachatryan merged PR #22086:
URL: https://github.com/apache/flink/pull/22086





[GitHub] [flink] rkhachatryan merged pull request #22087: [BP-1.16][FLINK-31133][tests] Prevent timeouts in PartiallyFinishedSourcesITCase

2023-03-03 Thread via GitHub


rkhachatryan merged PR #22087:
URL: https://github.com/apache/flink/pull/22087





[jira] [Created] (FLINK-31320) Modify DATE_FORMAT system (built-in) function to accept DATEs

2023-03-03 Thread James Mcguire (Jira)
James Mcguire created FLINK-31320:
-

 Summary: Modify DATE_FORMAT system (built-in) function to accept 
DATEs
 Key: FLINK-31320
 URL: https://issues.apache.org/jira/browse/FLINK-31320
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: James Mcguire


The current {{DATE_FORMAT}} function only supports {{TIMESTAMP}}s.

(See 
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/systemfunctions/#temporal-functions)

 

Ideally, it should be able to format {{DATE}}s as well as {{TIMESTAMP}}s.

 

Example usage:
{noformat}
Flink SQL> CREATE TABLE test_table (
>   some_date DATE,
>   object AS JSON_OBJECT(
> KEY 'some_date' VALUE DATE_FORMAT(some_date, '-MM-dd')
>   )
> )
> COMMENT ''
> WITH (
>   'connector'='datagen'
> )
> ;
> 
[ERROR] Could not execute SQL statement. Reason:
org.apache.calcite.sql.validate.SqlValidatorException: Cannot apply 
'DATE_FORMAT' to arguments of type 'DATE_FORMAT(, )'. Supported 
form(s): 'DATE_FORMAT(, )'
'DATE_FORMAT(, )'{noformat}





[GitHub] [flink] rkhachatryan commented on a diff in pull request #21923: FLINK-13871: Consolidate volatile status fields in StreamTask

2023-03-03 Thread via GitHub


rkhachatryan commented on code in PR #21923:
URL: https://github.com/apache/flink/pull/21923#discussion_r1124883714


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java:
##
@@ -733,8 +764,7 @@ void restoreInternal() throws Exception {
 // needed
 channelIOExecutor.shutdown();
 
-isRunning = true;
-isRestoring = false;
+taskState.status = TaskState.Status.RUNNING;

Review Comment:
   > we don't need to keep that logic since if the job is canceled already we 
can easily ignore other exceptions which can happen since they can be actually 
because of cancelation
   
   Please note that Task cancellation is not the same as Job cancellation. For 
example, in case of some intermittent failure, the Job will be `RESTARTING`, 
but the tasks will be cancelled and then rescheduled. Ideally, we shouldn't be 
ignoring **all** exceptions during the restart (only interruptions and 
cancellations). Otherwise, it might create a resource leak (job correctness 
won't be affected though).
   
   As for the `running` flag (which will report `true` on `master`; and `false` 
when CAS to `CANCELLED`);
   I briefly skimmed through its usages:
   - `Task.restoreAndInvoke` -> `StreamTask.invoke` -> 
`StreamTask.restoreInternal` >> regression: it will allow 2nd `restoreInternal`
   - ongoing state snapshots will not be aborted by RPC >> OK (they'll be 
aborted anyways by closing resources)
   - local state will not be pruned by abort-checkpoint-RPC >> regression (but 
**likely** will be pruned on the next checkpoint)
   - ...
   
   The above can probably be fixed (in this PR), but my point is that it's hard 
to check all the combinations and their usages.
   If that's true, I'd stick closer to the original behavior by having states 
**unless** we are sure that they are impossible. 
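
To make the discussed consolidation concrete, here is a hedged sketch (illustrative
names only, not the actual StreamTask code) of a single CAS-guarded status field
replacing the separate volatile flags: the RESTORING -> RUNNING transition fails if
another thread has already moved the task to CANCELLED, which is what prevents a
second restoreInternal from being allowed.

```java
import java.util.concurrent.atomic.AtomicReference;

public class TaskStatusSketch {

    enum Status { CREATED, RESTORING, RUNNING, CANCELLED }

    private final AtomicReference<Status> status =
            new AtomicReference<>(Status.CREATED);

    // Succeeds only when the current status matches `expected`; a concurrent
    // CAS to CANCELLED makes a later RESTORING -> RUNNING transition fail.
    boolean transition(Status expected, Status next) {
        return status.compareAndSet(expected, next);
    }

    // Replaces the old volatile `isRunning` flag: reports true only in RUNNING.
    boolean isRunning() {
        return status.get() == Status.RUNNING;
    }

    Status current() {
        return status.get();
    }
}
```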






[GitHub] [flink] rkhachatryan commented on a diff in pull request #21923: FLINK-13871: Consolidate volatile status fields in StreamTask

2023-03-03 Thread via GitHub


rkhachatryan commented on code in PR #21923:
URL: https://github.com/apache/flink/pull/21923#discussion_r1124883714


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java:
##
@@ -733,8 +764,7 @@ void restoreInternal() throws Exception {
 // needed
 channelIOExecutor.shutdown();
 
-isRunning = true;
-isRestoring = false;
+taskState.status = TaskState.Status.RUNNING;

Review Comment:
   > we don't need to keep that logic since if the job is canceled already we 
can easily ignore other exceptions which can happen since they can be actually 
because of cancelation
   
   Please note that Task cancellation is not the same as Job cancellation. For 
example, in case of some intermittent failure, the Job will be `RESTARTING`, 
but the tasks will be cancelled and then rescheduled. Ideally, we shouldn't be 
ignoring **all** exceptions during the restart (only interruptions and 
cancellations). Otherwise, it might create a resource leak (job correctness 
won't be affected though).
   
   As for the `running` flag (which will report `true` on `master`; and `false` 
when CAS to `CANCELLED`);
   I briefly skimmed though its usages:
   - `Task.restoreAndInvoke` -> `StreamTask.invoke` -> 
`StreamTask.restoreInternal` - regression: it will allow 2nd `restoreInternal`
   - ongoing state snapshots will not be aborted by RPC - that's fine (they'll 
be aborted anyways by closing resources)
   - local state will not be pruned by CHK abort RPC - regression (but 
**likely** will be pruned on the next checkpoint)
   - ...
   
   The above can probably be fixed (in this PR), but my point is that it's hard 
to check all the combinations and their usages.
   If that's true, I'd stick closer to the original behavior by having states 
**unless** we are sure that they are impossible. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] rkhachatryan commented on a diff in pull request #21923: FLINK-13871: Consolidate volatile status fields in StreamTask

2023-03-03 Thread via GitHub


rkhachatryan commented on code in PR #21923:
URL: https://github.com/apache/flink/pull/21923#discussion_r1124883714


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java:
##
@@ -733,8 +764,7 @@ void restoreInternal() throws Exception {
 // needed
 channelIOExecutor.shutdown();
 
-isRunning = true;
-isRestoring = false;
+taskState.status = TaskState.Status.RUNNING;

Review Comment:
   > we don't need to keep that logic since if the job is canceled already we 
can easily ignore other exceptions which can happen since they can be actually 
because of cancelation
   
   Please note that Task cancellation is not the same as Job cancellation. For 
example, in case of some intermittent failure, the Job will be `RESTARTING`, 
but the tasks will be cancelled and then rescheduled. Ideally, we shouldn't be 
ignoring **all** exceptions during the restart (only interruptions and 
cancellations). Otherwise, it might create a resource leak (job correctness 
won't be affected though).
   
   As for the `running` flag (which will report `true` on `master`; and `false` 
when CAS to `CANCELLED`);
   I briefly skimmed though its usages:
   - `Task.restoreAndInvoke` -> `StreamTask.invoke` -> 
`StreamTask.restoreInternal` - regression: it will allow 2nd `restoreInternal`
   - ongoing checkpoints will not be aborted by RPC - that's fine
   - local state will not be pruned by CHK abort RPC - regression (but 
**likely** will be pruned on the next checkpoint)
   - ...
   
   The above can probably be fixed (in this PR), but my point is that it's hard 
to check all the combinations and their usages.
   If that's true, I'd stick closer to the original behavior by having states 
**unless** we are sure that they are impossible. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] rkhachatryan commented on a diff in pull request #21923: FLINK-13871: Consolidate volatile status fields in StreamTask

2023-03-03 Thread via GitHub


rkhachatryan commented on code in PR #21923:
URL: https://github.com/apache/flink/pull/21923#discussion_r1124883714


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java:
##
@@ -733,8 +764,7 @@ void restoreInternal() throws Exception {
 // needed
 channelIOExecutor.shutdown();
 
-isRunning = true;
-isRestoring = false;
+taskState.status = TaskState.Status.RUNNING;

Review Comment:
   > we don't need to keep that logic since if the job is canceled already we 
can easily ignore other exceptions which can happen since they can be actually 
because of cancelation
   
   Please note that Task cancellation is not the same as Job cancellation. For 
example, in case of some intermittent failure, the Job will be `RESTARTING`, 
but the tasks will be cancelled and then rescheduled. Ideally, we shouldn't be 
ignoring **any** exception during restart (just interruptions and 
cancellations). Otherwise, it might create a resource leak (job correctness 
won't be affected though).
   
   As for the `running` flag (which will report `true` on `master`; and `false` 
when CAS to `CANCELLED`);
   I briefly skimmed though its usages:
   - `Task.restoreAndInvoke` -> `StreamTask.invoke` -> 
`StreamTask.restoreInternal` - regression: it will allow 2nd `restoreInternal`
   - ongoing checkpoints will not be aborted by RPC - that's fine
   - local state will not be pruned by CHK abort RPC - regression (but 
**likely** will be pruned on the next checkpoint)
   - ...
   
   The above can probably be fixed (in this PR), but my point is that it's hard 
to check all the combinations and their usages.
   If that's true, I'd stick closer to the original behavior by having states 
**unless** we are sure that they are impossible. 
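
   The CAS-on-a-single-status pattern under discussion can be sketched in a few lines. This is only an illustration: `Status`, `TaskStateSketch`, and `transition` are made-up names rather than Flink's actual `StreamTask` fields, and the enum deliberately omits most real states. It shows why a CAS transition refuses to overwrite a concurrent move to `CANCELLED`, where a plain volatile write would not.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative states only; Flink's real task lifecycle has more of them.
enum Status { CREATED, RESTORING, RUNNING, CANCELLED }

class TaskStateSketch {
    private final AtomicReference<Status> status =
            new AtomicReference<>(Status.CREATED);

    // CAS transition: returns false if another thread (e.g. a cancel RPC)
    // already moved the task out of the expected state.
    boolean transition(Status expected, Status target) {
        return status.compareAndSet(expected, target);
    }

    Status current() {
        return status.get();
    }
}

public class Main {
    public static void main(String[] args) {
        TaskStateSketch task = new TaskStateSketch();
        System.out.println(task.transition(Status.CREATED, Status.RESTORING));   // true
        // A concurrent cancel wins the race...
        System.out.println(task.transition(Status.RESTORING, Status.CANCELLED)); // true
        // ...so the restore thread's CAS to RUNNING fails, instead of an
        // unconditional `isRunning = true` silently resurrecting the task.
        System.out.println(task.transition(Status.RESTORING, Status.RUNNING));   // false
        System.out.println(task.current());                                      // CANCELLED
    }
}
```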






[jira] [Comment Edited] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696257#comment-17696257
 ] 

Ran Tao edited comment on FLINK-31006 at 3/3/23 6:20 PM:
-

[~syhily] [~renqs] Hi, guys. I think Qingsheng's response on GitHub ([https://github.com/apache/flink/pull/21909]) is right. I hit another problem and found this case:

1. Kafka source runs in bounded mode.
2. The enumerator starts and noMoreNewPartitionSplits is set to true, then the initial partition discovery is triggered.
3. As the partition discovery runs asynchronously in the worker thread, it's possible that a reader registers on the enumerator before the partition discovery finishes.

If we set `noMoreNewPartitionSplits = true;` when partition discovery is disabled, then because *context.callAsync* fetches the partitions asynchronously, a reader that registers before the call finishes will quit early. That means we cannot consume the bounded source at all, whereas we expect it to read the initial partitions, consume them, and then quit.

So I think the PR currently cannot work. However, the issue may be a special case, like a race condition, that we should keep an eye on.


was (Author: lemonjing):
[~syhily] [~renqs] hi, guys. I think Qingsheng's respond in 
github([https://github.com/apache/flink/pull/21909).] is right. I got another 
problem and found this case.

1.Kafka source runs in bounded mode.
2.The enumerator starts and noMoreNewPartitionSplits is set to true, then the 
initial partition discovery is triggered.
3.As the partition discovery runs asynchronously in the worker thread, it's 
possible that a reader can register on the enumerator before the partition 
discovery finishes.

If we set `noMoreNewPartitionSplits = true;` when partitionDiscovery is 
disabled.
because *context.callAsync* to get partitions is a async call. If call is not 
finished, it will quit early. it means we can not consume bounded source. 
However we expect it should read first partitions and consume and then quit.

so, i think currently the PR can not work.

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png, 
> image-2023-03-04-01-07-04-927.png, image-2023-03-04-01-07-36-168.png, 
> image-2023-03-04-01-08-29-042.png, image-2023-03-04-01-09-24-199.png
>
>
> when i do failover works like kill jm/tm when using  pipeline mode to run 
> bounded source like kafka, i found job is not finished, when every partition 
> data has consumed.
>  
> After dig into code, i found this logical not run when JM recover. the 
> partition infos are not changed. so noMoreNewPartitionSplits is not set to 
> true. then this will not run 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696257#comment-17696257
 ] 

Ran Tao edited comment on FLINK-31006 at 3/3/23 6:18 PM:
-

[~syhily] [~renqs] Hi, guys. I think Qingsheng's response on GitHub ([https://github.com/apache/flink/pull/21909]) is right. I hit another problem and found this case:

1. Kafka source runs in bounded mode.
2. The enumerator starts and noMoreNewPartitionSplits is set to true, then the initial partition discovery is triggered.
3. As the partition discovery runs asynchronously in the worker thread, it's possible that a reader registers on the enumerator before the partition discovery finishes.

If we set `noMoreNewPartitionSplits = true;` when partition discovery is disabled, then because *context.callAsync* fetches the partitions asynchronously, a reader that registers before the call finishes will quit early. That means we cannot consume the bounded source at all, whereas we expect it to read the initial partitions, consume them, and then quit.

So I think the PR currently cannot work.


was (Author: lemonjing):
[~syhily] [~renqs] hi, guys. I think Qingsheng's respond in 
github([https://github.com/apache/flink/pull/21909).] is right. I got another 
thing and found this case.

1.Kafka source runs in bounded mode.
2.The enumerator starts and noMoreNewPartitionSplits is set to true, then the 
initial partition discovery is triggered.
3.As the partition discovery runs asynchronously in the worker thread, it's 
possible that a reader can register on the enumerator before the partition 
discovery finishes.

If we set `noMoreNewPartitionSplits = true;` when partitionDiscovery is 
disabled.
because *context.callAsync* to get partitions is a async call. If call is not 
finished, it will quit early. it means we can not consume bounded source. 
However we expect it should read first partitions and consume and then quit.

so, i think currently the PR can not work.

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png, 
> image-2023-03-04-01-07-04-927.png, image-2023-03-04-01-07-36-168.png, 
> image-2023-03-04-01-08-29-042.png, image-2023-03-04-01-09-24-199.png
>
>
> when i do failover works like kill jm/tm when using  pipeline mode to run 
> bounded source like kafka, i found job is not finished, when every partition 
> data has consumed.
>  
> After dig into code, i found this logical not run when JM recover. the 
> partition infos are not changed. so noMoreNewPartitionSplits is not set to 
> true. then this will not run 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!





[jira] [Comment Edited] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696257#comment-17696257
 ] 

Ran Tao edited comment on FLINK-31006 at 3/3/23 6:17 PM:
-

[~syhily] [~renqs] Hi, guys. I think Qingsheng's response on GitHub ([https://github.com/apache/flink/pull/21909]) is right. I hit another problem and found this case:

1. Kafka source runs in bounded mode.
2. The enumerator starts and noMoreNewPartitionSplits is set to true, then the initial partition discovery is triggered.
3. As the partition discovery runs asynchronously in the worker thread, it's possible that a reader registers on the enumerator before the partition discovery finishes.

If we set `noMoreNewPartitionSplits = true;` when partition discovery is disabled, then because *context.callAsync* fetches the partitions asynchronously, a reader that registers before the call finishes will quit early. That means we cannot consume the bounded source at all, whereas we expect it to read the initial partitions, consume them, and then quit.

So I think the PR currently cannot work.


was (Author: lemonjing):
[~syhily] [~renqs] hi, guys. I think Qingsheng's respond in 
github([https://github.com/apache/flink/pull/21909).] is right. I got some same 
thing.

1.Kafka source runs in bounded mode.
2.The enumerator starts and noMoreNewPartitionSplits is set to true, then the 
initial partition discovery is triggered.
3.As the partition discovery runs asynchronously in the worker thread, it's 
possible that a reader can register on the enumerator before the partition 
discovery finishes.

If we set `noMoreNewPartitionSplits = true;` when partitionDiscovery is 
disabled.
because *context.callAsync* to get partitions is a async call. If call is not 
finished, it will quit early. it means we can not consume bounded source. 
However we expect it should read first partitions and consume and then quit.

so, i think currently the PR can not work.

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png, 
> image-2023-03-04-01-07-04-927.png, image-2023-03-04-01-07-36-168.png, 
> image-2023-03-04-01-08-29-042.png, image-2023-03-04-01-09-24-199.png
>
>
> when i do failover works like kill jm/tm when using  pipeline mode to run 
> bounded source like kafka, i found job is not finished, when every partition 
> data has consumed.
>  
> After dig into code, i found this logical not run when JM recover. the 
> partition infos are not changed. so noMoreNewPartitionSplits is not set to 
> true. then this will not run 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!





[jira] [Commented] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696257#comment-17696257
 ] 

Ran Tao commented on FLINK-31006:
-

[~syhily] [~renqs] Hi, guys. I think Qingsheng's response on GitHub ([https://github.com/apache/flink/pull/21909]) is right. I ran into the same thing:

1. Kafka source runs in bounded mode.
2. The enumerator starts and noMoreNewPartitionSplits is set to true, then the initial partition discovery is triggered.
3. As the partition discovery runs asynchronously in the worker thread, it's possible that a reader registers on the enumerator before the partition discovery finishes.

If we set `noMoreNewPartitionSplits = true;` when partition discovery is disabled, then because *context.callAsync* fetches the partitions asynchronously, a reader that registers before the call finishes will quit early. That means we cannot consume the bounded source at all, whereas we expect it to read the initial partitions, consume them, and then quit.

So I think the PR currently cannot work.
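
The race described above boils down to the ordering of three steps. The sketch below is not Flink code — `pendingSplits` and the helper method only stand in for the enumerator's state and for the check made when a reader registers — but it shows why eagerly setting the flag lets a reader finish before the asynchronous *context.callAsync* result arrives.

```java
import java.util.ArrayList;
import java.util.List;

public class Main {
    // Stand-in for the enumerator's reader-registration check: no pending
    // splits plus the flag set means the reader is told "no more splits".
    static boolean readerToldNoMoreSplits(List<String> pendingSplits,
                                          boolean noMoreNewPartitionSplits) {
        return pendingSplits.isEmpty() && noMoreNewPartitionSplits;
    }

    public static void main(String[] args) {
        List<String> pendingSplits = new ArrayList<>();

        // Step 1: the flag is set eagerly, while context.callAsync(...) is
        // still discovering the initial partitions in a worker thread.
        boolean noMoreNewPartitionSplits = true;

        // Step 2: a reader registers before discovery finishes, so the
        // bounded source finishes without consuming anything.
        System.out.println(
                readerToldNoMoreSplits(pendingSplits, noMoreNewPartitionSplits)); // true

        // Step 3: the discovery callback finally delivers splits, but the
        // reader has already been signalled to finish.
        pendingSplits.add("topic-0");
        System.out.println(pendingSplits); // [topic-0]
    }
}
```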

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png, 
> image-2023-03-04-01-07-04-927.png, image-2023-03-04-01-07-36-168.png, 
> image-2023-03-04-01-08-29-042.png, image-2023-03-04-01-09-24-199.png
>
>
> when i do failover works like kill jm/tm when using  pipeline mode to run 
> bounded source like kafka, i found job is not finished, when every partition 
> data has consumed.
>  
> After dig into code, i found this logical not run when JM recover. the 
> partition infos are not changed. so noMoreNewPartitionSplits is not set to 
> true. then this will not run 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!





[GitHub] [flink] flinkbot commented on pull request #22091: [FLINK-31319][connectors/kafka] Fix kafka partitionDiscoveryIntervalMs error condition check cause not set noMoreNewPartitionSplits

2023-03-03 Thread via GitHub


flinkbot commented on PR #22091:
URL: https://github.com/apache/flink/pull/22091#issuecomment-1453911937

   
   ## CI report:
   
   * 63b51c7bf01363f78b8de1d9131fc84d0c588bda UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-31319) Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs cause potential bug

2023-03-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-31319:
---
Labels: pull-request-available  (was: )

> Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs 
> cause potential bug
> -
>
> Key: FLINK-31319
> URL: https://issues.apache.org/jira/browse/FLINK-31319
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.16.1
>Reporter: Ran Tao
>Priority: Critical
>  Labels: pull-request-available
> Attachments: image-2023-03-04-01-37-29-360.png, 
> image-2023-03-04-01-39-20-352.png, image-2023-03-04-01-40-44-124.png, 
> image-2023-03-04-01-41-55-664.png
>
>
> As kafka option description, partitionDiscoveryIntervalMs <=0 means disabled. 
>  !image-2023-03-04-01-37-29-360.png|width=781,height=147!
> just like start kafka enumerator:
> !image-2023-03-04-01-39-20-352.png|width=465,height=311!
> but inner 
> handlePartitionSplitChanges use error if condition( < 0):
> !image-2023-03-04-01-40-44-124.png|width=576,height=237!
>  
> it will cause noMoreNewPartitionSplits can not be set to true. Finally cause
> !image-2023-03-04-01-41-55-664.png|width=522,height=610!
> bounded source can not signalNoMoreSplits.
> Besides,Both ends of the if condition should be mutually exclusive. WDYT?





[GitHub] [flink] chucheng92 opened a new pull request, #22091: [FLINK-31319][connectors/kafka] Fix kafka partitionDiscoveryIntervalMs error condition check cause not set noMoreNewPartitionSplits

2023-03-03 Thread via GitHub


chucheng92 opened a new pull request, #22091:
URL: https://github.com/apache/flink/pull/22091

   ## What is the purpose of the change
   
   Fix the incorrect kafka partitionDiscoveryIntervalMs check for noMoreNewSplits
   
   ## Brief change log
   
   Correct the partitionDiscoveryIntervalMs check for noMoreNewSplits to use <= 0
   
   ## Verifying this change
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
   - Dependencies (does it add or upgrade a dependency): no
   - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
   - The serializers: no
   - The runtime per-record code paths (performance sensitive): no
   - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
   - The S3 file system connector: no
   
   ## Documentation
   
   - Does this pull request introduce a new feature?  no
   - If yes, how is the feature documented? docs





[jira] [Updated] (FLINK-31319) Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs cause potential bug

2023-03-03 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-31319:

Description: 
As the Kafka option description states, partitionDiscoveryIntervalMs <= 0 means discovery is disabled.
!image-2023-03-04-01-37-29-360.png|width=781,height=147!

This is what the Kafka enumerator start checks:

!image-2023-03-04-01-39-20-352.png|width=465,height=311!

but handlePartitionSplitChanges uses the wrong condition (< 0):

!image-2023-03-04-01-40-44-124.png|width=576,height=237!

As a result, noMoreNewPartitionSplits can never be set to true, which finally causes

!image-2023-03-04-01-41-55-664.png|width=522,height=610!

the bounded source to never signalNoMoreSplits.

Besides, both branches of the if condition should be mutually exclusive. WDYT?

  was:
As kafka option description, partitionDiscoveryIntervalMs <=0 means disabled.  
!image-2023-03-04-01-37-29-360.png|width=781,height=147!

just like start kafka enum:

!image-2023-03-04-01-39-20-352.png|width=465,height=311!

but inner 
handlePartitionSplitChanges use error if condition( < 0):


!image-2023-03-04-01-40-44-124.png|width=576,height=237!
 
it will cause noMoreNewPartitionSplits can not be set to true. Finally cause
!image-2023-03-04-01-41-55-664.png|width=522,height=610!

bounded source can not signalNoMoreSplits.

Anyway,Both ends of the if condition should be mutually exclusive. WDYT?


> Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs 
> cause potential bug
> -
>
> Key: FLINK-31319
> URL: https://issues.apache.org/jira/browse/FLINK-31319
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.16.1
>Reporter: Ran Tao
>Priority: Critical
> Attachments: image-2023-03-04-01-37-29-360.png, 
> image-2023-03-04-01-39-20-352.png, image-2023-03-04-01-40-44-124.png, 
> image-2023-03-04-01-41-55-664.png
>
>
> As kafka option description, partitionDiscoveryIntervalMs <=0 means disabled. 
>  !image-2023-03-04-01-37-29-360.png|width=781,height=147!
> just like start kafka enumerator:
> !image-2023-03-04-01-39-20-352.png|width=465,height=311!
> but inner 
> handlePartitionSplitChanges use error if condition( < 0):
> !image-2023-03-04-01-40-44-124.png|width=576,height=237!
>  
> it will cause noMoreNewPartitionSplits can not be set to true. Finally cause
> !image-2023-03-04-01-41-55-664.png|width=522,height=610!
> bounded source can not signalNoMoreSplits.
> Besides,Both ends of the if condition should be mutually exclusive. WDYT?





[jira] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


[ https://issues.apache.org/jira/browse/FLINK-31006 ]


Ran Tao deleted comment on FLINK-31006:
-

was (Author: lemonjing):
Hi, guys. I think the PR is not work and may cause another bug. 
If we set `noMoreNewPartitionSplits = true;` when partitionDiscovery is 
disabled.

because *context.callAsync to get partitions is a async call.* If call is not 
finished, then code enter addReader:
!image-2023-03-04-01-07-04-927.png|width=588,height=381!
!image-2023-03-04-01-09-24-199.png|width=581,height=176!
!image-2023-03-04-01-08-29-042.png|width=596,height=636!

it will quit early. it means we can not consume bounded source. 
However we expect it should read first partitions and consume and then quit.
WDYT?
 

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png, 
> image-2023-03-04-01-07-04-927.png, image-2023-03-04-01-07-36-168.png, 
> image-2023-03-04-01-08-29-042.png, image-2023-03-04-01-09-24-199.png
>
>
> when i do failover works like kill jm/tm when using  pipeline mode to run 
> bounded source like kafka, i found job is not finished, when every partition 
> data has consumed.
>  
> After dig into code, i found this logical not run when JM recover. the 
> partition infos are not changed. so noMoreNewPartitionSplits is not set to 
> true. then this will not run 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!





[jira] [Created] (FLINK-31319) Inconsistent condition judgement about kafka partitionDiscoveryIntervalMs cause potential bug

2023-03-03 Thread Ran Tao (Jira)
Ran Tao created FLINK-31319:
---

 Summary: Inconsistent condition judgement about kafka 
partitionDiscoveryIntervalMs cause potential bug
 Key: FLINK-31319
 URL: https://issues.apache.org/jira/browse/FLINK-31319
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.16.1
Reporter: Ran Tao
 Attachments: image-2023-03-04-01-37-29-360.png, 
image-2023-03-04-01-39-20-352.png, image-2023-03-04-01-40-44-124.png, 
image-2023-03-04-01-41-55-664.png

As the Kafka option description states, partitionDiscoveryIntervalMs <= 0 means discovery is disabled.
!image-2023-03-04-01-37-29-360.png|width=781,height=147!

This is what the Kafka enumerator start checks:

!image-2023-03-04-01-39-20-352.png|width=465,height=311!

but handlePartitionSplitChanges uses the wrong condition (< 0):

!image-2023-03-04-01-40-44-124.png|width=576,height=237!

As a result, noMoreNewPartitionSplits can never be set to true, which finally causes

!image-2023-03-04-01-41-55-664.png|width=522,height=610!

the bounded source to never signalNoMoreSplits.

Anyway, both branches of the if condition should be mutually exclusive. WDYT?
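
The mismatch can be reduced to two predicates. The method names below are made up for illustration; only the comparisons correspond to the code in the screenshots: the enumerator start treats discovery as enabled when the interval is > 0, while handlePartitionSplitChanges only marks no-more-splits when it is < 0, so an interval of exactly 0 satisfies neither branch.

```java
public class Main {
    // Documented contract: intervalMs <= 0 means partition discovery is disabled.
    static boolean discoveryEnabled(long intervalMs) {
        return intervalMs > 0;
    }

    // Check as reported in this issue: uses < 0, so intervalMs == 0 matches
    // neither branch and noMoreNewPartitionSplits is never set.
    static boolean marksNoMoreSplitsBuggy(long intervalMs) {
        return intervalMs < 0;
    }

    // Mutually exclusive counterpart of discoveryEnabled.
    static boolean marksNoMoreSplitsFixed(long intervalMs) {
        return intervalMs <= 0;
    }

    public static void main(String[] args) {
        long intervalMs = 0; // documented as "disabled"
        System.out.println(discoveryEnabled(intervalMs));       // false
        System.out.println(marksNoMoreSplitsBuggy(intervalMs)); // false: bounded source never ends
        System.out.println(marksNoMoreSplitsFixed(intervalMs)); // true: signalNoMoreSplits can fire
    }
}
```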





[jira] [Created] (FLINK-31318) Do not scale down operators while processing backlog

2023-03-03 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-31318:
--

 Summary: Do not scale down operators while processing backlog
 Key: FLINK-31318
 URL: https://issues.apache.org/jira/browse/FLINK-31318
 Project: Flink
  Issue Type: Improvement
  Components: Autoscaler, Kubernetes Operator
Reporter: Gyula Fora
Assignee: Gyula Fora


Currently the autoscaler may try to scale down some operators even when the job is struggling to catch up.

This can lead to a vicious cycle where the backlog keeps increasing. It makes sense to hold off scale-down decisions until the job has caught up.
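
A minimal sketch of the proposed guard, assuming some "caught up" signal is available; the method and its parameters are illustrative, not the operator's actual autoscaler API:

```java
public class Main {
    // Never pick a lower parallelism while the job is still working
    // through backlog; scale-ups remain allowed.
    static int decideParallelism(int current, int recommended, boolean caughtUp) {
        if (recommended < current && !caughtUp) {
            return current; // hold off the scale-down until backlog is processed
        }
        return recommended;
    }

    public static void main(String[] args) {
        System.out.println(decideParallelism(8, 4, false)); // 8: backlog, keep capacity
        System.out.println(decideParallelism(8, 4, true));  // 4: caught up, safe to shrink
        System.out.println(decideParallelism(8, 12, false)); // 12: scale-up always allowed
    }
}
```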





[GitHub] [flink] flinkbot commented on pull request #22090: [Flink 31170] [docs]The spelling error of the document word causes sql to fail to execute

2023-03-03 Thread via GitHub


flinkbot commented on PR #22090:
URL: https://github.com/apache/flink/pull/22090#issuecomment-1453861541

   
   ## CI report:
   
   * 2b7853f7bf56569a1f1b1d9c43db55f580b7f933 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink] BoYiZhang opened a new pull request, #22090: [Flink 31170] [docs]The spelling error of the document word causes sql to fail to execute

2023-03-03 Thread via GitHub


BoYiZhang opened a new pull request, #22090:
URL: https://github.com/apache/flink/pull/22090

   
   ## What is the purpose of the change
   
 - *Fix spelling errors in the create table statement*
   
   
   ## Brief change log
   
 - *The create table statement provided by the document is incorrect, and 
an error is reported during execution*
   
   
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (no )
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? ( no)
 - If yes, how is the feature documented? (not documented)
   





[jira] [Commented] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696234#comment-17696234
 ] 

Ran Tao commented on FLINK-31006:
-

Hi, guys. I think the PR does not work and may cause another bug if we set `noMoreNewPartitionSplits = true;` when partition discovery is disabled.

Because *context.callAsync* fetches the partitions asynchronously, if the call has not finished when the code enters addReader:
!image-2023-03-04-01-07-04-927.png|width=588,height=381!
!image-2023-03-04-01-09-24-199.png|width=581,height=176!
!image-2023-03-04-01-08-29-042.png|width=596,height=636!

the reader will quit early. That means we cannot consume the bounded source, whereas we expect it to read the initial partitions, consume them, and then quit.
WDYT?
 

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png, 
> image-2023-03-04-01-07-04-927.png, image-2023-03-04-01-07-36-168.png, 
> image-2023-03-04-01-08-29-042.png, image-2023-03-04-01-09-24-199.png
>
>
> when i do failover works like kill jm/tm when using  pipeline mode to run 
> bounded source like kafka, i found job is not finished, when every partition 
> data has consumed.
>  
> After dig into code, i found this logical not run when JM recover. the 
> partition infos are not changed. so noMoreNewPartitionSplits is not set to 
> true. then this will not run 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!





[GitHub] [flink] BoYiZhang closed pull request #22006: [Flink 31170] [docs]The spelling error of the document word causes sql to fail to execute

2023-03-03 Thread via GitHub


BoYiZhang closed pull request #22006: [Flink 31170] [docs]The spelling error of 
the document word causes sql to fail to execute 
URL: https://github.com/apache/flink/pull/22006





[GitHub] [flink] BoYiZhang commented on pull request #22006: [Flink 31170] [docs]The spelling error of the document word causes sql to fail to execute

2023-03-03 Thread via GitHub


BoYiZhang commented on PR #22006:
URL: https://github.com/apache/flink/pull/22006#issuecomment-1453845141

   ?





[jira] [Updated] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-31006:

Attachment: image-2023-03-04-01-09-24-199.png

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png, 
> image-2023-03-04-01-07-04-927.png, image-2023-03-04-01-07-36-168.png, 
> image-2023-03-04-01-08-29-042.png, image-2023-03-04-01-09-24-199.png
>
>
> when i do failover works like kill jm/tm when using  pipeline mode to run 
> bounded source like kafka, i found job is not finished, when every partition 
> data has consumed.
>  
> After dig into code, i found this logical not run when JM recover. the 
> partition infos are not changed. so noMoreNewPartitionSplits is not set to 
> true. then this will not run 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!





[jira] [Updated] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-31006:

Attachment: image-2023-03-04-01-08-29-042.png

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png, 
> image-2023-03-04-01-07-04-927.png, image-2023-03-04-01-07-36-168.png, 
> image-2023-03-04-01-08-29-042.png
>
>
> When I do failover tests (e.g. killing the JM/TM) while using pipeline mode to run a 
> bounded source like Kafka, I found the job does not finish even when every partition's 
> data has been consumed.
>  
> After digging into the code, I found this logic does not run when the JM recovers: the 
> partition infos are unchanged, so noMoreNewPartitionSplits is not set to 
> true, and then this will not run. 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31249) Checkpoint timeout mechanism fails when finalizeCheckpoint is stuck

2023-03-03 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696230#comment-17696230
 ] 

Roman Khachatryan commented on FLINK-31249:
---

Allowing a new checkpoint to be triggered without unblocking the other (main) thread 
doesn't make much sense to me: the main thread is required at least to process the 
ACKs for that new checkpoint.

 

Ideally, all IO should be done in a separate thread, but we're not there yet. I 
don't see a way to interrupt writing metadata generically (for any filesystem).

Rather, specific FS implementations can be configured to time out overly long 
requests.

 

Besides that, the same filesystem usually stores both the state backend snapshots and 
this metadata. When it is overloaded, it's more likely that the state backend snapshots 
will time out first.

> Checkpoint timeout mechanism fails when finalizeCheckpoint is stuck
> ---
>
> Key: FLINK-31249
> URL: https://issues.apache.org/jira/browse/FLINK-31249
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.6, 1.16.0
>Reporter: Renxiang Zhou
>Priority: Major
> Fix For: 1.18.0
>
> Attachments: image-2023-02-28-11-25-03-637.png, 
> image-2023-02-28-12-04-35-178.png, image-2023-02-28-12-17-19-607.png
>
>
> When jobmanager receives all ACKs of tasks, it will finalize the pending 
> checkpoint to a completed checkpoint. Currently JM finalizes the pending 
> checkpoint with holding the checkpoint coordinator lock.
> When a DFS failure occurs, the {{jobmanager-future}} thread may be blocked at 
> finalizing the pending checkpoint.
> !image-2023-02-28-12-17-19-607.png|width=1010,height=244!
> And then the next checkpoint is triggered, the {{Checkpoint Timer}} thread 
> waits for the lock to be released. 
> !image-2023-02-28-11-25-03-637.png|width=1144,height=248!
> If the previous checkpoint times out, the {{Checkpoint Timer}} will not 
> execute the timeout event since it is blocked at waiting for the lock. As a 
> result, the previous checkpoint cannot be cancelled.
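The blocking pattern described above (one thread holding the checkpoint coordinator lock while finalizing, the Checkpoint Timer waiting on the same lock and therefore unable to cancel the checkpoint) can be reproduced with a minimal, self-contained sketch. The thread and lock names are illustrative only, and the sleep stands in for a stuck DFS write:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class CheckpointLockDemo {
    private static final Object coordinatorLock = new Object();

    public static boolean timerBlocked() {
        CountDownLatch finalizing = new CountDownLatch(1);
        CountDownLatch timerDone = new CountDownLatch(1);

        Thread jobManagerFuture = new Thread(() -> {
            synchronized (coordinatorLock) {
                finalizing.countDown();
                try {
                    Thread.sleep(500); // stands in for a stuck DFS write in finalizeCheckpoint()
                } catch (InterruptedException ignored) { }
            }
        });

        Thread checkpointTimer = new Thread(() -> {
            synchronized (coordinatorLock) { // the timeout handler needs the same lock
                timerDone.countDown();
            }
        });

        try {
            jobManagerFuture.start();
            finalizing.await(); // make sure the lock is held before starting the timer
            checkpointTimer.start();
            // While finalization holds the lock, the timer cannot run its timeout logic.
            boolean blockedDuringFinalize = !timerDone.await(200, TimeUnit.MILLISECONDS);
            jobManagerFuture.join();
            checkpointTimer.join();
            // Once the lock is released, the timer finally proceeds.
            return blockedDuringFinalize && timerDone.getCount() == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("timer blocked while finalizing: " + timerBlocked());
    }
}
```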



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-31006:

Attachment: image-2023-03-04-01-07-04-927.png

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png, 
> image-2023-03-04-01-07-04-927.png, image-2023-03-04-01-07-36-168.png
>
>
> When I do failover tests (e.g. killing the JM/TM) while using pipeline mode to run a 
> bounded source like Kafka, I found the job does not finish even when every partition's 
> data has been consumed.
>  
> After digging into the code, I found this logic does not run when the JM recovers: the 
> partition infos are unchanged, so noMoreNewPartitionSplits is not set to 
> true, and then this will not run. 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-31006:

Attachment: image-2023-03-04-01-07-36-168.png

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png, 
> image-2023-03-04-01-07-04-927.png, image-2023-03-04-01-07-36-168.png
>
>
> When I do failover tests (e.g. killing the JM/TM) while using pipeline mode to run a 
> bounded source like Kafka, I found the job does not finish even when every partition's 
> data has been consumed.
>  
> After digging into the code, I found this logic does not run when the JM recovers: the 
> partition infos are unchanged, so noMoreNewPartitionSplits is not set to 
> true, and then this will not run. 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] akalash commented on a diff in pull request #21923: FLINK-13871: Consolidate volatile status fields in StreamTask

2023-03-03 Thread via GitHub


akalash commented on code in PR #21923:
URL: https://github.com/apache/flink/pull/21923#discussion_r1124761808


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java:
##
@@ -733,8 +764,7 @@ void restoreInternal() throws Exception {
 // needed
 channelIOExecutor.shutdown();
 
-isRunning = true;
-isRestoring = false;
+taskState.status = TaskState.Status.RUNNING;

Review Comment:
   Potentially we could have a CANCELLING status, but right now I don't see where we 
would actually use it. For example, in the above-described scenario we have 
canceled=true at the same time as isRestoring=true. We use isRestoring only 
inside `handleAsyncException`, and if we want to keep this logic we indeed need 
some extra status like CANCELLING. But I actually think we don't need to keep 
that logic: if the job is already canceled, we can safely ignore other 
exceptions, since they may well be caused by the cancellation. I have the same 
thoughts about canceled=true + isRunning=true (I don't see where we should 
execute the isRunning=true logic if canceled is already true).
   Long story short: if we indeed understand where we need a CANCELLING status, 
let's add it; but if we need it only to keep the old behavior, let's discuss 
that old behavior, since I am not really sure the old behavior is correct.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-31006:

Attachment: image-2023-03-04-01-05-25-335.png

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png, image-2023-03-04-01-05-25-335.png
>
>
> When I do failover tests (e.g. killing the JM/TM) while using pipeline mode to run a 
> bounded source like Kafka, I found the job does not finish even when every partition's 
> data has been consumed.
>  
> After digging into the code, I found this logic does not run when the JM recovers: the 
> partition infos are unchanged, so noMoreNewPartitionSplits is not set to 
> true, and then this will not run. 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31006) job is not finished when using pipeline mode to run bounded source like kafka/pulsar

2023-03-03 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated FLINK-31006:

Attachment: image-2023-03-04-01-04-18-658.png

> job is not finished when using pipeline mode to run bounded source like 
> kafka/pulsar
> 
>
> Key: FLINK-31006
> URL: https://issues.apache.org/jira/browse/FLINK-31006
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka, Connectors / Pulsar
>Affects Versions: 1.17.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0
>
> Attachments: image-2023-02-10-13-20-52-890.png, 
> image-2023-02-10-13-23-38-430.png, image-2023-02-10-13-24-46-929.png, 
> image-2023-03-04-01-04-18-658.png
>
>
> When I do failover tests (e.g. killing the JM/TM) while using pipeline mode to run a 
> bounded source like Kafka, I found the job does not finish even when every partition's 
> data has been consumed.
>  
> After digging into the code, I found this logic does not run when the JM recovers: the 
> partition infos are unchanged, so noMoreNewPartitionSplits is not set to 
> true, and then this will not run. 
>  
> !image-2023-02-10-13-23-38-430.png!
>  
> !image-2023-02-10-13-24-46-929.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-31284) Increase KerberosLoginProvider test coverage

2023-03-03 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-31284.
--
Fix Version/s: 1.18.0
   Resolution: Fixed

merged to master 7f5240c9f912ec68c0b18b0022147eaa27992e4f

> Increase KerberosLoginProvider test coverage
> 
>
> Key: FLINK-31284
> URL: https://issues.apache.org/jira/browse/FLINK-31284
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Tests
>Affects Versions: 1.17.0
>Reporter: Gabor Somogyi
>Assignee: Gabor Somogyi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] gyfora merged pull request #22061: [FLINK-31284][tests] Increase KerberosLoginProvider test coverage

2023-03-03 Thread via GitHub


gyfora merged PR #22061:
URL: https://github.com/apache/flink/pull/22061


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-24379) Support AWS Glue Schema Registry Avro for Table API

2023-03-03 Thread Karthi Thyagarajan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696210#comment-17696210
 ] 

Karthi Thyagarajan commented on FLINK-24379:


FYI - just letting folks know that I'm picking up this issue and will create a 
new PR shortly

> Support AWS Glue Schema Registry Avro for Table API
> ---
>
> Key: FLINK-24379
> URL: https://issues.apache.org/jira/browse/FLINK-24379
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / API
>Affects Versions: 1.14.0
>Reporter: Brad Davis
>Assignee: Karthi Thyagarajan
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: aws-connector-4.1.0
>
>
> Unlike most (all?) of the other Avro formats, the AWS Glue Schema Registry 
> version doesn't include a 
> META-INF/services/org.apache.flink.table.factories.Factory resource or a 
> class implementing 
> org.apache.flink.table.factories.DeserializationFormatFactory and 
> org.apache.flink.table.factories.SerializationFormatFactory.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31317) Introduce data structures for managing resource requirements of a job

2023-03-03 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-31317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Morávek updated FLINK-31317:
--
Summary: Introduce data structures for managing resource requirements of a 
job  (was: Introduce data structure for managing resource requirements of a job)

> Introduce data structures for managing resource requirements of a job
> -
>
> Key: FLINK-31317
> URL: https://issues.apache.org/jira/browse/FLINK-31317
> Project: Flink
>  Issue Type: Sub-task
>Reporter: David Morávek
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] rkhachatryan commented on a diff in pull request #21923: FLINK-13871: Consolidate volatile status fields in StreamTask

2023-03-03 Thread via GitHub


rkhachatryan commented on code in PR #21923:
URL: https://github.com/apache/flink/pull/21923#discussion_r1124691916


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java:
##
@@ -733,8 +764,7 @@ void restoreInternal() throws Exception {
 // needed
 channelIOExecutor.shutdown();
 
-isRunning = true;
-isRestoring = false;
+taskState.status = TaskState.Status.RUNNING;

Review Comment:
   > now all transitions are correct
   
   I'd add CANCELLING state to reflect the above combination of flags.
   
   > I also don't think that we should get back to 4 flags since we already 
spend a lot of time understanding which combinations of them are really possible
   
   Makes sense. I'm not insisting on any option. As for the knowledge, should 
we encode it as allowed state transitions? I believe it was there previously in 
the PR, but then was deleted. 
   
   > I still think that we should even get rid of failing
   
   👍 and it seems just necessary if we proceed with CAS



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-24379) Support AWS Glue Schema Registry Avro for Table API

2023-03-03 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer updated FLINK-24379:
--
Fix Version/s: aws-connector-4.1.0
   (was: 1.17.0)

> Support AWS Glue Schema Registry Avro for Table API
> ---
>
> Key: FLINK-24379
> URL: https://issues.apache.org/jira/browse/FLINK-24379
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / API
>Affects Versions: 1.14.0
>Reporter: Brad Davis
>Assignee: Karthi Thyagarajan
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: aws-connector-4.1.0
>
>
> Unlike most (all?) of the other Avro formats, the AWS Glue Schema Registry 
> version doesn't include a 
> META-INF/services/org.apache.flink.table.factories.Factory resource or a 
> class implementing 
> org.apache.flink.table.factories.DeserializationFormatFactory and 
> org.apache.flink.table.factories.SerializationFormatFactory.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-24379) Support AWS Glue Schema Registry Avro for Table API

2023-03-03 Thread Danny Cranmer (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer reassigned FLINK-24379:
-

Assignee: Karthi Thyagarajan  (was: Brad Davis)

> Support AWS Glue Schema Registry Avro for Table API
> ---
>
> Key: FLINK-24379
> URL: https://issues.apache.org/jira/browse/FLINK-24379
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / API
>Affects Versions: 1.14.0
>Reporter: Brad Davis
>Assignee: Karthi Thyagarajan
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Fix For: 1.17.0
>
>
> Unlike most (all?) of the other Avro formats, the AWS Glue Schema Registry 
> version doesn't include a 
> META-INF/services/org.apache.flink.table.factories.Factory resource or a 
> class implementing 
> org.apache.flink.table.factories.DeserializationFormatFactory and 
> org.apache.flink.table.factories.SerializationFormatFactory.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31317) Introduce data structure for managing resource requirements of a job

2023-03-03 Thread Jira
David Morávek created FLINK-31317:
-

 Summary: Introduce data structure for managing resource 
requirements of a job
 Key: FLINK-31317
 URL: https://issues.apache.org/jira/browse/FLINK-31317
 Project: Flink
  Issue Type: Sub-task
Reporter: David Morávek






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31316) FLIP-291: Externalized Declarative Resource Management

2023-03-03 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-31316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Morávek updated FLINK-31316:
--
Fix Version/s: 1.18.0

> FLIP-291: Externalized Declarative Resource Management
> --
>
> Key: FLINK-31316
> URL: https://issues.apache.org/jira/browse/FLINK-31316
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / REST
>Reporter: David Morávek
>Assignee: David Morávek
>Priority: Major
> Fix For: 1.18.0
>
>
> This is an umbrella ticket for 
> [FLIP-291|https://cwiki.apache.org/confluence/display/FLINK/FLIP-291%3A+Externalized+Declarative+Resource+Management].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-connector-opensearch] reta commented on pull request #5: [FLINK-30488] OpenSearch implementation of Async Sink

2023-03-03 Thread via GitHub


reta commented on PR #5:
URL: 
https://github.com/apache/flink-connector-opensearch/pull/5#issuecomment-1453764023

   Thanks a lot for review @dannycranmer 
   
   > @reta The PR looks good to me minus the Mockito comment. However I have 
questions over the approach here. We are adding a new sink alongside the 
existing sink, we will have `OpensearchSink` and `OpensearchAsyncSink`. How do 
the users know which one to pick? Why not replace the existing sink with the 
new implementation? 
   
   This is indeed a good question. I think the main difference between them lies 
in the internal APIs the implementations are based upon:
- `OpensearchAsyncSink` uses `RestHighLevelClient::bulkAsync` directly to 
dispatch the bulk requests
- `OpensearchSink` uses `BulkProcessor` and offers more flexibility with 
respect to failure handling and backoff policies (no straight equivalent in 
`RestHighLevelClient`)
   
   I have covered this part in the docs, thank you.
   
   > The [Jira](https://issues.apache.org/jira/browse/FLINK-31068) mentions 
docs, however there is no update here. Will you create a followup PR for that?
   > 
   
   Updated the documentation, thank you
   
   > If this has already been discussed on mailing lists I missed that, please 
give me a link :D
   
   You mean the `AsyncSink` implementation for OpenSearch? No, it was not 
discussed on the mailing list but was mentioned on the initial pull request 
https://github.com/apache/flink/pull/18541#issuecomment-1026931087 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-31316) FLIP-291: Externalized Declarative Resource Management

2023-03-03 Thread Jira
David Morávek created FLINK-31316:
-

 Summary: FLIP-291: Externalized Declarative Resource Management
 Key: FLINK-31316
 URL: https://issues.apache.org/jira/browse/FLINK-31316
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Coordination, Runtime / REST
Reporter: David Morávek
Assignee: David Morávek


This is an umbrella ticket for 
[FLIP-291|https://cwiki.apache.org/confluence/display/FLINK/FLIP-291%3A+Externalized+Declarative+Resource+Management].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] dannycranmer closed pull request #17360: [FLINK-24379][Formats] Add support for Glue schema registry in Table API

2023-03-03 Thread via GitHub


dannycranmer closed pull request #17360: [FLINK-24379][Formats] Add support for 
Glue schema registry in Table API
URL: https://github.com/apache/flink/pull/17360


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] dannycranmer commented on pull request #17360: [FLINK-24379][Formats] Add support for Glue schema registry in Table API

2023-03-03 Thread via GitHub


dannycranmer commented on PR #17360:
URL: https://github.com/apache/flink/pull/17360#issuecomment-1453759473

   This connector has been moved to 
https://github.com/apache/flink-connector-aws/tree/main/flink-formats-aws/flink-avro-glue-schema-registry.
 Closing PR. Please reopen targeting flink-connector-aws


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-31315) FlinkActionsE2eTest.testMergeInto is unstable

2023-03-03 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-31315:


 Summary: FlinkActionsE2eTest.testMergeInto is unstable
 Key: FLINK-31315
 URL: https://issues.apache.org/jira/browse/FLINK-31315
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.4.0


{code:java}
Error:  Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 320.272 
s <<< FAILURE! - in org.apache.flink.table.store.tests.FlinkActionsE2eTest
Error:  testMergeInto  Time elapsed: 111.826 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: 
Result is still unexpected after 60 retries.
Expected: {3, v_3, creation, 02-27=1, 2, v_2, creation, 02-27=1, 6, v_6, 
creation, 02-28=1, 1, v_1, creation, 02-27=1, 8, v_8, insert, 02-29=1, 11, 
v_11, insert, 02-29=1, 7, Seven, matched_upsert, 02-28=1, 5, v_5, creation, 
02-28=1, 10, v_10, creation, 02-28=1, 9, v_9, creation, 02-28=1}
Actual: {4, v_4, creation, 02-27=1, 8, v_8, creation, 02-28=1, 3, v_3, 
creation, 02-27=1, 7, v_7, creation, 02-28=1, 2, v_2, creation, 02-27=1, 6, 
v_6, creation, 02-28=1, 1, v_1, creation, 02-27=1, 5, v_5, creation, 02-28=1, 
10, v_10, creation, 02-28=1, 9, v_9, creation, 02-28=1}
  at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:39)
  at org.junit.jupiter.api.Assertions.fail(Assertions.java:134)
  at 
org.apache.flink.table.store.tests.E2eTestBase.checkResult(E2eTestBase.java:261)
  at 
org.apache.flink.table.store.tests.FlinkActionsE2eTest.testMergeInto(FlinkActionsE2eTest.java:355)
 {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31315) FlinkActionsE2eTest.testMergeInto is unstable

2023-03-03 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-31315:
-
Issue Type: Bug  (was: Improvement)

> FlinkActionsE2eTest.testMergeInto is unstable
> -
>
> Key: FLINK-31315
> URL: https://issues.apache.org/jira/browse/FLINK-31315
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.4.0
>
>
> {code:java}
> Error:  Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 320.272 s <<< FAILURE! - in 
> org.apache.flink.table.store.tests.FlinkActionsE2eTest
> Error:  testMergeInto  Time elapsed: 111.826 s  <<< FAILURE!
> org.opentest4j.AssertionFailedError: 
> Result is still unexpected after 60 retries.
> Expected: {3, v_3, creation, 02-27=1, 2, v_2, creation, 02-27=1, 6, v_6, 
> creation, 02-28=1, 1, v_1, creation, 02-27=1, 8, v_8, insert, 02-29=1, 11, 
> v_11, insert, 02-29=1, 7, Seven, matched_upsert, 02-28=1, 5, v_5, creation, 
> 02-28=1, 10, v_10, creation, 02-28=1, 9, v_9, creation, 02-28=1}
> Actual: {4, v_4, creation, 02-27=1, 8, v_8, creation, 02-28=1, 3, v_3, 
> creation, 02-27=1, 7, v_7, creation, 02-28=1, 2, v_2, creation, 02-27=1, 6, 
> v_6, creation, 02-28=1, 1, v_1, creation, 02-27=1, 5, v_5, creation, 02-28=1, 
> 10, v_10, creation, 02-28=1, 9, v_9, creation, 02-28=1}
>   at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:39)
>   at org.junit.jupiter.api.Assertions.fail(Assertions.java:134)
>   at 
> org.apache.flink.table.store.tests.E2eTestBase.checkResult(E2eTestBase.java:261)
>   at 
> org.apache.flink.table.store.tests.FlinkActionsE2eTest.testMergeInto(FlinkActionsE2eTest.java:355)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-31288) Disable overdraft buffer for batch shuffle

2023-03-03 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo closed FLINK-31288.
--
Fix Version/s: 1.17.0
   Resolution: Fixed

> Disable overdraft buffer for batch shuffle
> --
>
> Key: FLINK-31288
> URL: https://issues.apache.org/jira/browse/FLINK-31288
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.17.0, 1.16.1
>Reporter: Weijie Guo
>Assignee: Weijie Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.17.0
>
>
> Only pipelined / pipelined-bounded partition needs overdraft buffer. More 
> specifically, there is no reason to request more buffers for non-pipelined 
> (i.e. batch) shuffle. The reasons are as follows:
>  # For BoundedBlockingShuffle, each full buffer will be directly released.
>  # For SortMergeShuffle, the maximum capacity of buffer pool is 4 * 
> numSubpartitions. It is efficient enough to spill this part of memory to disk.
>  # For Hybrid Shuffle, the buffer pool is unbounded. If it can't get a normal 
> buffer, it also can't get an overdraft buffer.
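The rule proposed above can be sketched as a tiny gate (hypothetical names; the real fix lives in Flink's buffer pool code): the configured overdraft limit is simply treated as zero for any partition type that is not pipelined / pipelined-bounded.

```java
public class OverdraftGateDemo {
    enum ResultPartitionType {
        PIPELINED, PIPELINED_BOUNDED, BLOCKING, SORT_MERGE, HYBRID;

        boolean isPipelinedOrPipelinedBounded() {
            return this == PIPELINED || this == PIPELINED_BOUNDED;
        }
    }

    /**
     * Hypothetical gate: overdraft buffers only make sense for pipelined
     * shuffles, so for batch-style partitions the effective limit is zero.
     */
    static int effectiveMaxOverdraft(ResultPartitionType type, int configuredMax) {
        return type.isPipelinedOrPipelinedBounded() ? configuredMax : 0;
    }

    public static void main(String[] args) {
        for (ResultPartitionType t : ResultPartitionType.values()) {
            System.out.println(t + " -> " + effectiveMaxOverdraft(t, 5));
        }
    }
}
```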



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-31288) Disable overdraft buffer for batch shuffle

2023-03-03 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17695780#comment-17695780
 ] 

Weijie Guo edited comment on FLINK-31288 at 3/3/23 2:49 PM:


master(1.18) via 382148e1229901ab54503c8d9af6a18ea4c078dc.
release-1.17 via 7dd61c31714c1b07790982d21a486f5f803708df.
release-1.16 via 01c8eb59c1be92f1f8c1b81c66073eeb6009eb86.


was (Author: weijie guo):
master(1.18) via 382148e1229901ab54503c8d9af6a18ea4c078dc.
release-1.17 via 7dd61c31714c1b07790982d21a486f5f803708df.
release-1.16 waiting for CI green.

> Disable overdraft buffer for batch shuffle
> --
>
> Key: FLINK-31288
> URL: https://issues.apache.org/jira/browse/FLINK-31288
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.17.0, 1.16.1
>Reporter: Weijie Guo
>Assignee: Weijie Guo
>Priority: Blocker
>  Labels: pull-request-available
>
> Only pipelined / pipelined-bounded partition needs overdraft buffer. More 
> specifically, there is no reason to request more buffers for non-pipelined 
> (i.e. batch) shuffle. The reasons are as follows:
>  # For BoundedBlockingShuffle, each full buffer will be directly released.
>  # For SortMergeShuffle, the maximum capacity of buffer pool is 4 * 
> numSubpartitions. It is efficient enough to spill this part of memory to disk.
>  # For Hybrid Shuffle, the buffer pool is unbounded. If it can't get a normal 
> buffer, it also can't get an overdraft buffer.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] reswqa merged pull request #22076: [BP-1.16][FLINK-31288][runtime] Disable overdraft buffer for non pipelined result partition.

2023-03-03 Thread via GitHub


reswqa merged PR #22076:
URL: https://github.com/apache/flink/pull/22076


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] akalash commented on a diff in pull request #21923: FLINK-13871: Consolidate volatile status fields in StreamTask

2023-03-03 Thread via GitHub


akalash commented on code in PR #21923:
URL: https://github.com/apache/flink/pull/21923#discussion_r1124556907


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java:
##
@@ -733,8 +764,7 @@ void restoreInternal() throws Exception {
 // needed
 channelIOExecutor.shutdown();
 
-isRunning = true;
-isRestoring = false;
+taskState.status = TaskState.Status.RUNNING;

Review Comment:
   As I understand it, right now all transitions are correct. The only problem is 
that we can overwrite a final status with another final status, which is incorrect 
behaviour. To fix that, I agree we can use CAS to update TaskStateStatus and 
allow only the transitions described in the javadoc.
   
   I also don't think that we should go back to 4 flags, since we already spent 
a lot of time understanding which combinations of them are really possible and 
it seems nobody knows for sure. So it is better to reduce these combinations, 
and I think an enum with clear transitions is a good choice.
   
   I still think that we should also get rid of the `failing` flag, since I feel we 
can fold it into the enum, but I'm ok if we do it later.
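The CAS idea discussed above can be sketched as follows. This is a minimal illustration of the pattern, not Flink's actual `StreamTask` code; the status names and helper are assumptions:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: a single enum status replacing several volatile flags,
// updated via compare-and-set so that a terminal (final) status can never be
// overwritten by another terminal status.
class TaskStatusHolder {
    enum Status {
        CREATED, RESTORING, RUNNING, FINISHED, CANCELED, FAILED;

        boolean isTerminal() {
            return this == FINISHED || this == CANCELED || this == FAILED;
        }
    }

    private final AtomicReference<Status> status = new AtomicReference<>(Status.CREATED);

    /** Attempts expected -> next; rejects any transition out of a terminal state. */
    boolean transition(Status expected, Status next) {
        if (expected.isTerminal()) {
            return false; // terminal statuses are final and never overwritten
        }
        return status.compareAndSet(expected, next);
    }

    Status get() {
        return status.get();
    }
}
```

The CAS also rejects racing transitions whose expected state is already stale, which is exactly the property that a set of independent volatile booleans cannot guarantee.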






[jira] [Commented] (FLINK-26808) [flink v1.14.2] Submit jobs via REST API not working after set web.submit.enable: false

2023-03-03 Thread Tobias Hofer (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696167#comment-17696167
 ] 

Tobias Hofer commented on FLINK-26808:
--

The expectation is that Flink behaves as documented.

In 
[https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/] 
I can read the following:
{quote}{{{}web.submit.enable{}}}: Enables uploading and starting jobs through 
the Flink UI {_}(true by default){_}. Please note that even when this is 
disabled, session clusters still accept jobs through REST requests (HTTP 
calls). This flag only guards the feature to upload jobs in the UI.
{quote}
I would like to be able to disable submission via the UI but still allow job 
submission via REST to work.

This behavior impacts the Flink Kubernetes Operator. It fails with

{color:#174ea6}Warning | SESSIONJOBEXCEPTION | 
org.apache.flink.runtime.rest.util.RestClientException: [Not found: 
/v1/jars/upload]"{color}

> [flink v1.14.2] Submit jobs via REST API not working after set 
> web.submit.enable: false
> ---
>
> Key: FLINK-26808
> URL: https://issues.apache.org/jira/browse/FLINK-26808
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.14.2
>Reporter: Luís Costa
>Priority: Minor
>
> Greetings,
> I am using flink version 1.14.2 and after changing web.submit.enable to 
> false, job submission via REST API is no longer working. 
> The app that uses flink receives a 404 with "Not found: /jars/upload" 
> Looking into the 
> [documentation|https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/],
> I saw that web.upload.dir is only used if {{web.submit.enable}} is true; if 
> not, JOB_MANAGER_WEB_TMPDIR_KEY will be used.
> Doing a curl to /jars it returns:
> {code:java}
> curl -X GET http://localhost:8081/jars
> HTTP/1.1 404 Not Found
> {"errors":["Unable to load requested file /jars."]} {code}
> Found this issue related to option web.submit.enable 
> https://issues.apache.org/jira/browse/FLINK-13799
> Could you please let me know if this is an issue that you are already aware of?
> Thanks in advance
> Best regards,
> Luís Costa
>  





[GitHub] [flink-connector-opensearch] reta commented on a diff in pull request #11: [FLINK-30998] Add optional exception handler to flink-connector-opensearch

2023-03-03 Thread via GitHub


reta commented on code in PR #11:
URL: 
https://github.com/apache/flink-connector-opensearch/pull/11#discussion_r1124477439


##
flink-connector-opensearch/src/main/java/org/apache/flink/connector/opensearch/sink/OpensearchWriter.java:
##
@@ -122,10 +128,11 @@
 } catch (Exception e) {
 throw new FlinkRuntimeException("Failed to open the 
OpensearchEmitter", e);
 }
+this.failureHandler = failureHandler;
 }
 
 @Override
-public void write(IN element, Context context) throws IOException, 
InterruptedException {
+public void write(IN element, Context context) throws InterruptedException 
{

Review Comment:
   @lilyevsky could you please address this and 
[this](https://github.com/apache/flink-connector-opensearch/pull/11/files#r1110066551)
 comment? thank you






[jira] [Closed] (FLINK-31239) Fix sum function can't get the corrected value when the argument type is string

2023-03-03 Thread Godfrey He (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Godfrey He closed FLINK-31239.
--
Fix Version/s: 1.18.0
   Resolution: Fixed

Fixed in

1.18.0: 
263555c9adcca0abe194e9a6c1d85ec591c304e4..62a3b99d23229b39c798a0b657cb11218a5bc940

1.17.0: 
3bdb50513ddbbf6c67560a078da3f9506e5cd611..ac2eb5b977de47fc5550d2ee9f30fff4dcaca2b6

> Fix sum function can't get the corrected value when the argument type is 
> string
> ---
>
> Key: FLINK-31239
> URL: https://issues.apache.org/jira/browse/FLINK-31239
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Affects Versions: 1.17.0
>Reporter: dalongliu
>Assignee: dalongliu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.18.0
>
>






[jira] [Commented] (FLINK-30978) ExecutorImplITCase.testInterruptExecution hangs waiting for SQL gateway service closing

2023-03-03 Thread Shengkai Fang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696139#comment-17696139
 ] 

Shengkai Fang commented on FLINK-30978:
---

Merged into master: d96bb2f66d71fecdc5dba183ad04c9ba75e40845

> ExecutorImplITCase.testInterruptExecution hangs waiting for SQL gateway 
> service closing
> ---
>
> Key: FLINK-30978
> URL: https://issues.apache.org/jira/browse/FLINK-30978
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Qingsheng Ren
>Assignee: Shengkai Fang
>Priority: Major
>  Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45921&view=logs&j=0c940707-2659-5648-cbe6-a1ad63045f0a&t=075c2716-8010-5565-fe08-3c4bb45824a4&l=44674





[GitHub] [flink] godfreyhe closed pull request #22031: [FLINK-31239][hive] Fix native sum function can't get the corrected value when the argument type is string

2023-03-03 Thread via GitHub


godfreyhe closed pull request #22031: [FLINK-31239][hive] Fix native sum 
function can't get the corrected value when the argument type is string
URL: https://github.com/apache/flink/pull/22031





[jira] [Updated] (FLINK-30978) ExecutorImplITCase.testInterruptExecution hangs waiting for SQL gateway service closing

2023-03-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-30978:
---
Labels: pull-request-available test-stability  (was: test-stability)

> ExecutorImplITCase.testInterruptExecution hangs waiting for SQL gateway 
> service closing
> ---
>
> Key: FLINK-30978
> URL: https://issues.apache.org/jira/browse/FLINK-30978
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Gateway
>Affects Versions: 1.17.0
>Reporter: Qingsheng Ren
>Assignee: Shengkai Fang
>Priority: Major
>  Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45921&view=logs&j=0c940707-2659-5648-cbe6-a1ad63045f0a&t=075c2716-8010-5565-fe08-3c4bb45824a4&l=44674





[GitHub] [flink] fsk119 merged pull request #22055: [FLINK-30978][sql-client] Fix ExecutorImpl#testInterruptException hangs

2023-03-03 Thread via GitHub


fsk119 merged PR #22055:
URL: https://github.com/apache/flink/pull/22055





[jira] [Closed] (FLINK-31313) Unsupported meta columns in column list of insert statement

2023-03-03 Thread lincoln lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-31313.
---
Resolution: Duplicate

> Unsupported meta columns in column list of insert statement
> ---
>
> Key: FLINK-31313
> URL: https://issues.apache.org/jira/browse/FLINK-31313
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.17.0, 1.16.1
>Reporter: lincoln lee
>Priority: Major
>
> Currently an error is raised when referencing meta columns in the column list 
> of an insert statement, e.g.,
> {code}
> INSERT INTO sink (a,b,f) -- here `f` is a metadata column of sink table
> SELECT ...{code}
> {code}
> Caused by: org.apache.calcite.runtime.CalciteContextException: At line 1, 
> column 44: Unknown target column 'f'
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:505)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:932)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:917)
>   at 
> org.apache.flink.table.planner.calcite.PreValidateReWriter$.newValidationError(PreValidateReWriter.scala:440)
>   at 
> org.apache.flink.table.planner.calcite.PreValidateReWriter$.validateField(PreValidateReWriter.scala:428)
>   at 
> org.apache.flink.table.planner.calcite.PreValidateReWriter$.$anonfun$appendPartitionAndNullsProjects$3(PreValidateReWriter.scala:169)
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233)
>   at scala.collection.Iterator.foreach(Iterator.scala:937)
>   at scala.collection.Iterator.foreach$(Iterator.scala:937)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1425)
>   at scala.collection.IterableLike.foreach(IterableLike.scala:70)
>   at scala.collection.IterableLike.foreach$(IterableLike.scala:69)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:233)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:226)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> org.apache.flink.table.planner.calcite.PreValidateReWriter$.appendPartitionAndNullsProjects(PreValidateReWriter.scala:161)
>   at 
> org.apache.flink.table.planner.calcite.PreValidateReWriter.rewriteInsert(PreValidateReWriter.scala:72)
> {code}
> The cause is that the current PreValidateReWriter, in the validation phase, uses 
> the physical types of the sink table, which do not include metadata columns





[jira] [Closed] (FLINK-31185) Python BroadcastProcessFunction not support side output

2023-03-03 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu closed FLINK-31185.
---
  Assignee: Juntao Hu
Resolution: Fixed

Fixed in:
- master: 8d52415a05bdc67eb67a59bbc2e53f48762da374
- 1.17: 7040af5b7933905798ff6af0b35ac364b5fbe432
- 1.16: 8713b176abc5c9f5267d7559ace0b6bd8afc6d3f

> Python BroadcastProcessFunction not support side output
> ---
>
> Key: FLINK-31185
> URL: https://issues.apache.org/jira/browse/FLINK-31185
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.16.1
>Reporter: Juntao Hu
>Assignee: Juntao Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.17.0, 1.16.2
>
>






[jira] [Commented] (FLINK-31313) Unsupported meta columns in column list of insert statement

2023-03-03 Thread lincoln lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696129#comment-17696129
 ] 

lincoln lee commented on FLINK-31313:
-

[~csq] Thanks for the reminder; I've marked this as a duplicate. I've also left 
some comments on your PR.

> Unsupported meta columns in column list of insert statement
> ---
>
> Key: FLINK-31313
> URL: https://issues.apache.org/jira/browse/FLINK-31313
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.17.0, 1.16.1
>Reporter: lincoln lee
>Priority: Major
>
> Currently an error is raised when referencing meta columns in the column list 
> of an insert statement, e.g.,
> {code}
> INSERT INTO sink (a,b,f) -- here `f` is a metadata column of sink table
> SELECT ...{code}
> {code}
> Caused by: org.apache.calcite.runtime.CalciteContextException: At line 1, 
> column 44: Unknown target column 'f'
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:505)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:932)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:917)
>   at 
> org.apache.flink.table.planner.calcite.PreValidateReWriter$.newValidationError(PreValidateReWriter.scala:440)
>   at 
> org.apache.flink.table.planner.calcite.PreValidateReWriter$.validateField(PreValidateReWriter.scala:428)
>   at 
> org.apache.flink.table.planner.calcite.PreValidateReWriter$.$anonfun$appendPartitionAndNullsProjects$3(PreValidateReWriter.scala:169)
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233)
>   at scala.collection.Iterator.foreach(Iterator.scala:937)
>   at scala.collection.Iterator.foreach$(Iterator.scala:937)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1425)
>   at scala.collection.IterableLike.foreach(IterableLike.scala:70)
>   at scala.collection.IterableLike.foreach$(IterableLike.scala:69)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:233)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:226)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>   at 
> org.apache.flink.table.planner.calcite.PreValidateReWriter$.appendPartitionAndNullsProjects(PreValidateReWriter.scala:161)
>   at 
> org.apache.flink.table.planner.calcite.PreValidateReWriter.rewriteInsert(PreValidateReWriter.scala:72)
> {code}
> The cause is that the current PreValidateReWriter, in the validation phase, uses 
> the physical types of the sink table, which do not include metadata columns





[GitHub] [flink] dianfu closed pull request #22003: [FLINK-31185][Python] Support side-output in broadcast processing

2023-03-03 Thread via GitHub


dianfu closed pull request #22003: [FLINK-31185][Python] Support side-output in 
broadcast processing
URL: https://github.com/apache/flink/pull/22003





[GitHub] [flink] lincoln-lil commented on a diff in pull request #21897: [FLINK-30922][table-planner] Apply persisted columns when doing appendPartitionAndNullsProjects

2023-03-03 Thread via GitHub


lincoln-lil commented on code in PR #21897:
URL: https://github.com/apache/flink/pull/21897#discussion_r1124356820


##
flink-table/flink-table-common/src/main/java/org/apache/flink/table/utils/TableSchemaUtils.java:
##
@@ -68,6 +68,36 @@ public static TableSchema getPhysicalSchema(TableSchema 
tableSchema) {
 return builder.build();
 }
 
+/**
+ * Return {@link TableSchema} which consists of all persisted columns. 
That means, the virtual
+ * computed columns and metadata columns are filterd out.

Review Comment:
   nit: -> filtered



##
flink-table/flink-table-common/src/main/java/org/apache/flink/table/utils/TableSchemaUtils.java:
##
@@ -68,6 +68,36 @@ public static TableSchema getPhysicalSchema(TableSchema 
tableSchema) {
 return builder.build();
 }
 
+/**
+ * Return {@link TableSchema} which consists of all persisted columns. 
That means, the virtual
+ * computed columns and metadata columns are filterd out.
+ *
+ * Readers(or writers) such as {@link TableSource} and {@link 
TableSink} should use this
+ * persisted schema to generate {@link TableSource#getProducedDataType()} 
and {@link

Review Comment:
   Users should know the difference between this method and `getPhysicalSchema`.



##
flink-table/flink-table-common/src/main/java/org/apache/flink/table/utils/TableSchemaUtils.java:
##
@@ -68,6 +68,36 @@ public static TableSchema getPhysicalSchema(TableSchema 
tableSchema) {
 return builder.build();
 }
 
+/**
+ * Return {@link TableSchema} which consists of all persisted columns. 
That means, the virtual
+ * computed columns and metadata columns are filterd out.
+ *
+ * Readers(or writers) such as {@link TableSource} and {@link 
TableSink} should use this
+ * persisted schema to generate {@link TableSource#getProducedDataType()} 
and {@link
+ * TableSource#getTableSchema()} rather than using the raw TableSchema 
which may contains
+ * additional columns.
+ */
+public static TableSchema getPersistedSchema(TableSchema tableSchema) {

Review Comment:
   we'd better extract a common method for this and `getPhysicalSchema` since 
only one condition differs
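The suggested refactoring (one shared helper parameterized by the single differing condition) could look like the following sketch. `Column` here is a simplified stand-in, not Flink's actual `TableColumn` class:

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch: a common filtering helper so that the "physical" and
// "persisted" schema builders differ only in the predicate they pass.
class SchemaColumns {
    static final class Column {
        final String name;
        final boolean physical;   // regular data column
        final boolean persisted;  // physical column or persisted metadata column

        Column(String name, boolean physical, boolean persisted) {
            this.name = name;
            this.physical = physical;
            this.persisted = persisted;
        }
    }

    // Shared helper: keep only the columns matching the given predicate.
    static List<String> filterNames(List<Column> columns, Predicate<Column> keep) {
        return columns.stream().filter(keep).map(c -> c.name).collect(Collectors.toList());
    }

    static List<String> physicalNames(List<Column> columns) {
        return filterNames(columns, c -> c.physical);
    }

    static List<String> persistedNames(List<Column> columns) {
        return filterNames(columns, c -> c.persisted);
    }
}
```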






[jira] [Assigned] (FLINK-31154) Build Release Candidate: 1.17.0-rc1

2023-03-03 Thread Qingsheng Ren (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qingsheng Ren reassigned FLINK-31154:
-

Assignee: Qingsheng Ren

> Build Release Candidate: 1.17.0-rc1
> ---
>
> Key: FLINK-31154
> URL: https://issues.apache.org/jira/browse/FLINK-31154
> Project: Flink
>  Issue Type: New Feature
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Qingsheng Ren
>Priority: Major
>
> The core of the release process is the build-vote-fix cycle. Each cycle 
> produces one release candidate. The Release Manager repeats this cycle until 
> the community approves one release candidate, which is then finalized.
> h4. Prerequisites
> Set up a few environment variables to simplify Maven commands that follow. 
> This identifies the release candidate being built. Start with {{RC_NUM}} 
> equal to 1 and increment it for each candidate:
> {code}
> RC_NUM="1"
> TAG="release-${RELEASE_VERSION}-rc${RC_NUM}"
> {code}





[jira] [Commented] (FLINK-31249) Checkpoint timeout mechanism fails when finalizeCheckpoint is stuck

2023-03-03 Thread Renxiang Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696116#comment-17696116
 ] 

Renxiang Zhou commented on FLINK-31249:
---

[~roman] If it takes too long to finalize the checkpoint metadata, it usually 
means that there is a problem with the external storage service (in HDFS, this 
could happen when writing to a slow DataNode). In this case, we could retry 
writing the metadata to DFS, or just discard this checkpoint and make another 
one, rather than leaving the checkpoint stuck. What do you think?

> Checkpoint timeout mechanism fails when finalizeCheckpoint is stuck
> ---
>
> Key: FLINK-31249
> URL: https://issues.apache.org/jira/browse/FLINK-31249
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.11.6, 1.16.0
>Reporter: Renxiang Zhou
>Priority: Major
> Fix For: 1.18.0
>
> Attachments: image-2023-02-28-11-25-03-637.png, 
> image-2023-02-28-12-04-35-178.png, image-2023-02-28-12-17-19-607.png
>
>
> When the jobmanager receives all ACKs from tasks, it will finalize the pending 
> checkpoint into a completed checkpoint. Currently the JM finalizes the pending 
> checkpoint while holding the checkpoint coordinator lock.
> When a DFS failure occurs, the {{jobmanager-future}} thread may be blocked at 
> finalizing the pending checkpoint.
> !image-2023-02-28-12-17-19-607.png|width=1010,height=244!
> Then, when the next checkpoint is triggered, the {{Checkpoint Timer}} thread 
> waits for the lock to be released. 
> !image-2023-02-28-11-25-03-637.png|width=1144,height=248!
> If the previous checkpoint times out, the {{Checkpoint Timer}} will not 
> execute the timeout event since it is blocked at waiting for the lock. As a 
> result, the previous checkpoint cannot be cancelled.
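The contention described above can be illustrated with a minimal sketch. This is a hypothetical simplification, not Flink's actual CheckpointCoordinator code:

```java
// Hypothetical sketch: finalization runs slow DFS I/O while holding the
// coordinator lock, so the timer thread that should cancel a timed-out
// checkpoint blocks on the same lock and the timeout cannot fire until
// the I/O returns.
class CoordinatorLockSketch {
    private final Object coordinatorLock = new Object();

    void finalizeCheckpoint(Runnable slowMetadataWrite) {
        synchronized (coordinatorLock) {
            slowMetadataWrite.run(); // may block for a long time on a broken DFS
        }
    }

    void onCheckpointTimeout(Runnable cancelAction) {
        synchronized (coordinatorLock) { // waits here while finalization is stuck
            cancelAction.run();
        }
    }
}
```

One possible mitigation, along the lines of the comment above, would be to move the slow metadata write out of the locked section (or bound it with a timeout and retry) so that the timer thread can still cancel the stuck checkpoint.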





[jira] [Closed] (FLINK-31294) CommitterOperator forgot to close Committer when closing.

2023-03-03 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-31294.

Fix Version/s: table-store-0.4.0
   Resolution: Fixed

master: 2e053e445be99dc0e7fc445728c381bbb8e7af37

> CommitterOperator forgot to close Committer when closing.
> -
>
> Key: FLINK-31294
> URL: https://issues.apache.org/jira/browse/FLINK-31294
> Project: Flink
>  Issue Type: Bug
>  Components: Table Store
>Reporter: Ming Li
>Assignee: Ming Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.4.0
>
>
> {{CommitterOperator}} does not close the {{Committer}} when it closes, which 
> may lead to resource leaks.




