[jira] [Commented] (FLINK-32700) Support job drain for Savepoint upgrade mode jobs in Flink Operator
[ https://issues.apache.org/jira/browse/FLINK-32700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748784#comment-17748784 ] Gyula Fora commented on FLINK-32700: Here it is:) https://issues.apache.org/jira/browse/FLINK-30529 > Support job drain for Savepoint upgrade mode jobs in Flink Operator > --- > > Key: FLINK-32700 > URL: https://issues.apache.org/jira/browse/FLINK-32700 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.5.0 >Reporter: Manan Mangal >Assignee: Manan Mangal >Priority: Major > > During cancel job with savepoint upgrade mode, jobs can be allowed to drain > by advancing the watermark to the end, before they are stopped, so that the > in-flight data is not lost. > If the job fails to drain and hits timeout or any other error, it can be > cancelled without taking a savepoint. -- This message was sent by Atlassian Jira (v8.20.10#820010)
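The "drain by advancing the watermark to the end" semantics from the ticket description can be illustrated with a small self-contained Java sketch. This is not operator code; the priority queue here is a stand-in for Flink's event-time timer service. The point is that advancing the watermark to Long.MAX_VALUE fires every pending event-time timer before the job is stopped, so no in-flight event-time state is silently dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch of "drain" semantics (not Flink operator code):
// draining means advancing the watermark to Long.MAX_VALUE before the stop,
// so every pending event-time timer fires instead of being discarded.
public class DrainSketch {
    private final PriorityQueue<Long> timers = new PriorityQueue<>();
    private final List<Long> fired = new ArrayList<>();

    public void registerTimer(long ts) {
        timers.add(ts);
    }

    // Advancing the watermark fires all timers at or below it.
    public void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.peek() <= watermark) {
            fired.add(timers.poll());
        }
    }

    public List<Long> stopWithDrain() {
        advanceWatermark(Long.MAX_VALUE); // the "drain": flush everything
        return fired;
    }

    public static void main(String[] args) {
        DrainSketch job = new DrainSketch();
        job.registerTimer(10L);
        job.registerTimer(20L);
        job.advanceWatermark(15L);              // only the first timer fires
        System.out.println(job.stopWithDrain()); // [10, 20]
    }
}
```

Without the final `advanceWatermark(Long.MAX_VALUE)` step, the timer at 20 would still be pending when the job stops, which is exactly the in-flight data loss the ticket describes.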
[jira] [Commented] (FLINK-32700) Support job drain for Savepoint upgrade mode jobs in Flink Operator
[ https://issues.apache.org/jira/browse/FLINK-32700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748783#comment-17748783 ] Gyula Fora commented on FLINK-32700: Don't get me wrong, I think adding support for draining makes perfect sense and we should start with that :) We can discuss the savepoint timeout behavior in a separate ticket. It sounds a little strange that you want to drain pipelines to avoid losing data but would be OK with losing data on a savepoint timeout. I think if the job has started to fail we should simply cancel/delete. I agree there is a related ticket somewhere; I will try to dig it out after I come back from vacation on Wednesday.
[GitHub] [flink-connector-rabbitmq] RocMarshal commented on pull request #1: [FLINK-20628] RabbitMQ Connector using FLIP-27 Source API
RocMarshal commented on PR #1: URL: https://github.com/apache/flink-connector-rabbitmq/pull/1#issuecomment-1656547551 > > I am working for it now. > > Any update to report from your end on this @RocMarshal ? Hi, @MartijnVisser Still working on it; I'll share an update by the end of this week. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] RocMarshal commented on pull request #20990: [FLINK-29050][JUnit5 Migration] Module: flink-hadoop-compatibility
RocMarshal commented on PR #20990: URL: https://github.com/apache/flink/pull/20990#issuecomment-1656535485 Hi @1996fanrui, would you mind helping review this PR if you have free time? Thank you very much~
[jira] [Comment Edited] (FLINK-32700) Support job drain for Savepoint upgrade mode jobs in Flink Operator
[ https://issues.apache.org/jira/browse/FLINK-32700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748759#comment-17748759 ] Talat Uyarer edited comment on FLINK-32700 at 7/29/23 1:36 AM: --- [~gyfora] Currently there is an issue with the Operator's delete. We want to drain jobs because when we call {code:java} kubectl delete flinkdeployment{code} the Operator deletes immediately for stateless/stateful jobs, so we lose in-flight data. I believe by default the Operator should delete jobs by emitting the max watermark. However, emitting the max watermark also has an issue: if the sink is stuck, we cannot delete the job and we have created a deadlock situation. Regardless of which FlinkDeployment upgrade mode we use, with last-state/savepoint we should drain in-flight data to prevent unnecessary data duplication. We do not silently cancel the jobs; we actually wait until the savepoint/checkpoint timeout when a user deletes their FlinkDeployment. In the current situation the Operator does not even wait for the timeout; it deletes immediately. We would like to follow your suggestion, but please keep in mind that we have 20K+ stateful/stateless jobs that are triggered programmatically; there is no way for us to change their deployments manually. cc [~mmangal]
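The fallback behavior discussed in this thread (drain with a savepoint first; on timeout or error, cancel without one) can be sketched in plain Java. This is a hedged, self-contained sketch, not operator code: `triggerDrain` is a hypothetical stand-in for the real stop-with-savepoint call, and the savepoint path in `main` is a placeholder.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch of the proposed shutdown policy: attempt a drain (stop with
// savepoint) within a timeout; if the drain hangs (e.g. a stuck sink) or
// fails, fall back to cancelling without a savepoint so deletion never
// deadlocks.
public class DrainWithFallback {
    public static String shutDown(CompletableFuture<String> triggerDrain, Duration timeout) {
        try {
            // Happy path: drain completed, a savepoint path comes back.
            return "drained:" + triggerDrain.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException | InterruptedException | ExecutionException e) {
            // Stuck sink or drain error: give up on the savepoint.
            return "cancelled-without-savepoint";
        }
    }

    public static void main(String[] args) {
        // Drain completes in time (placeholder savepoint path):
        System.out.println(shutDown(
                CompletableFuture.completedFuture("s3://sp/1"), Duration.ofSeconds(1)));
        // Sink stuck: the drain future never completes, so we fall back:
        System.out.println(shutDown(new CompletableFuture<>(), Duration.ofMillis(50)));
    }
}
```

The second call is the deadlock scenario raised above: the drain never finishes, and the timeout is what keeps the deletion from hanging forever.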
[GitHub] [flink] netvl commented on pull request #22854: [FLINK-32137] Added filtering of lambdas when building the flame graph
netvl commented on PR #22854: URL: https://github.com/apache/flink/pull/22854#issuecomment-1656406321 @1996fanrui sorry for the delay! I fixed the checkstyle issues (`mvn checkstyle:check` now passes locally), and I fixed the picture and the description of the PR. Please review, thanks!
[jira] [Commented] (FLINK-20767) add nested field support for SupportsFilterPushDown
[ https://issues.apache.org/jira/browse/FLINK-20767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748687#comment-17748687 ] Venkata krishnan Sowrirajan commented on FLINK-20767: - I am currently looking into this issue; our internal use case has lots of nested fields. Based on experiments with both Spark and Flink, queries are very slow without nested-field filter pushdown. > add nested field support for SupportsFilterPushDown > --- > > Key: FLINK-20767 > URL: https://issues.apache.org/jira/browse/FLINK-20767 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Affects Versions: 1.12.0 >Reporter: Jun Zhang >Priority: Major > Fix For: 1.18.0 > > > I think we should add the nested field support for SupportsFilterPushDown
[jira] [Commented] (FLINK-32710) The LeaderElection component IDs for running is only the JobID which might be confusing in the log output
[ https://issues.apache.org/jira/browse/FLINK-32710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748644#comment-17748644 ] Jiadong Lu commented on FLINK-32710: Hi, Matthias. If I understand correctly, this ticket is about adding an ID prefix that identifies the component, like zk-\{JobId}? If so, I would like to fix it; please assign it to me. > The LeaderElection component IDs for running is only the JobID which might be > confusing in the log output > - > > Key: FLINK-32710 > URL: https://issues.apache.org/jira/browse/FLINK-32710 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Matthias Pohl >Priority: Minor > Labels: starter > > I noticed that the leader log messages for the jobs use the plain job ID as > the component ID. That might be confusing when reading the logs since it's a > UUID with no additional context. > We might want to add a prefix (e.g. {{job-}} to these component IDs.)
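The change the ticket proposes is tiny; a hedged sketch follows (hypothetical helper, not the actual leader-election code — `job-` is the prefix the ticket itself suggests):

```java
import java.util.UUID;

// Sketch of FLINK-32710's proposal: prefix the bare job UUID with a
// component marker so leader-election log lines carry context.
public class ComponentIdSketch {
    public static String componentId(UUID jobId) {
        return "job-" + jobId; // "job-" prefix as suggested in the ticket
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("00000000-0000-0000-0000-000000000001");
        System.out.println(componentId(id)); // job-00000000-0000-0000-0000-000000000001
    }
}
```

A log line reading `job-<uuid>` is immediately recognizable as a per-job leader-election component, whereas a bare UUID is not.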
[jira] [Updated] (FLINK-32711) Type mismatch when proctime function used as parameter
[ https://issues.apache.org/jira/browse/FLINK-32711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aitozi updated FLINK-32711: --- Issue Type: Bug (was: Improvement) > Type mismatch when proctime function used as parameter > -- > > Key: FLINK-32711 > URL: https://issues.apache.org/jira/browse/FLINK-32711 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.18.0 >Reporter: Aitozi >Priority: Major > > reproduce case: > {code:sql} > SELECT TYPEOF(PROCTIME()) > {code} > this query will fail with > org.apache.flink.table.planner.codegen.CodeGenException: Mismatch of > function's argument data type 'TIMESTAMP_LTZ(3) NOT NULL' and actual argument > type 'TIMESTAMP_LTZ(3)'.
[jira] [Commented] (FLINK-32711) Type mismatch when proctime function used as parameter
[ https://issues.apache.org/jira/browse/FLINK-32711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748626#comment-17748626 ] Aitozi commented on FLINK-32711: It's caused by the hard-coded nullable result type of the PROCTIME_MATERIALIZE codegen.
[jira] [Created] (FLINK-32711) Type mismatch when proctime function used as parameter
Aitozi created FLINK-32711: -- Summary: Type mismatch when proctime function used as parameter Key: FLINK-32711 URL: https://issues.apache.org/jira/browse/FLINK-32711 Project: Flink Issue Type: Improvement Components: Table SQL / Planner Affects Versions: 1.18.0 Reporter: Aitozi reproduce case: {code:sql} SELECT TYPEOF(PROCTIME()) {code} this query will fail with org.apache.flink.table.planner.codegen.CodeGenException: Mismatch of function's argument data type 'TIMESTAMP_LTZ(3) NOT NULL' and actual argument type 'TIMESTAMP_LTZ(3)'.
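The mismatch in this ticket boils down to a nullability difference between two otherwise identical types. A minimal stand-in (a hypothetical `LogicalType` record, not the planner's actual class) shows why a strict type-equality check in codegen fails:

```java
// Sketch of the mismatch in FLINK-32711: PROCTIME() is declared
// TIMESTAMP_LTZ(3) NOT NULL, but the materialization codegen hard-codes a
// nullable TIMESTAMP_LTZ(3), so an exact type comparison fails.
record LogicalType(String name, boolean nullable) {}

public class TypeMismatchSketch {
    public static void main(String[] args) {
        LogicalType declared = new LogicalType("TIMESTAMP_LTZ(3)", false);     // NOT NULL
        LogicalType materialized = new LogicalType("TIMESTAMP_LTZ(3)", true);  // hard-coded nullable
        System.out.println(declared.equals(materialized)); // false
    }
}
```

The `false` here corresponds to the `CodeGenException` in the report: the types agree in everything except nullability, and that single flag is enough to break the check.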
[jira] [Updated] (FLINK-32681) RocksDBStateDownloaderTest.testMultiThreadCleanupOnFailure unstable
[ https://issues.apache.org/jira/browse/FLINK-32681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl updated FLINK-32681: -- Issue Type: Bug (was: Technical Debt) > RocksDBStateDownloaderTest.testMultiThreadCleanupOnFailure unstable > > > Key: FLINK-32681 > URL: https://issues.apache.org/jira/browse/FLINK-32681 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends, Tests >Affects Versions: 1.18.0 >Reporter: Chesnay Schepler >Priority: Critical > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51712=logs=77a9d8e1-d610-59b3-fc2a-4766541e0e33=125e07e7-8de0-5c6c-a541-a567415af3ef > Failed 3 times in yesterday's nightly run. > {code} > Jul 26 01:12:46 01:12:46.889 [ERROR] > org.apache.flink.contrib.streaming.state.RocksDBStateDownloaderTest.testMultiThreadCleanupOnFailure > Time elapsed: 0.044 s <<< FAILURE! > Jul 26 01:12:46 java.lang.AssertionError > Jul 26 01:12:46 at org.junit.Assert.fail(Assert.java:87) > Jul 26 01:12:46 at org.junit.Assert.assertTrue(Assert.java:42) > Jul 26 01:12:46 at org.junit.Assert.assertFalse(Assert.java:65) > Jul 26 01:12:46 at org.junit.Assert.assertFalse(Assert.java:75) > Jul 26 01:12:46 at > org.apache.flink.contrib.streaming.state.RocksDBStateDownloaderTest.testMultiThreadCleanupOnFailure(RocksDBStateDownloaderTest.java:151) > {code}
[jira] [Commented] (FLINK-32681) RocksDBStateDownloaderTest.testMultiThreadCleanupOnFailure unstable
[ https://issues.apache.org/jira/browse/FLINK-32681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748609#comment-17748609 ] Matthias Pohl commented on FLINK-32681: --- I'm linking FLINK-32345 as a cause because that failing test was added in that issue. [~srichter] can you take this one? I raised the priority to Critical as well.
[jira] [Updated] (FLINK-32681) RocksDBStateDownloaderTest.testMultiThreadCleanupOnFailure unstable
[ https://issues.apache.org/jira/browse/FLINK-32681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl updated FLINK-32681: -- Priority: Critical (was: Major)
[jira] [Commented] (FLINK-32662) JobMasterTest.testRetrievingCheckpointStats fails with NPE on AZP
[ https://issues.apache.org/jira/browse/FLINK-32662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748607#comment-17748607 ] Matthias Pohl commented on FLINK-32662: --- https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51777=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7c1d86e3-35bd-5fd5-3b7c-30c126a78702=7282 > JobMasterTest.testRetrievingCheckpointStats fails with NPE on AZP > - > > Key: FLINK-32662 > URL: https://issues.apache.org/jira/browse/FLINK-32662 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Sergey Nuyanzin >Assignee: Hong Liang Teoh >Priority: Critical > Labels: test-stability > > This build > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51452=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7c1d86e3-35bd-5fd5-3b7c-30c126a78702=8654 > fails with NPE as > {noformat} > Jul 20 01:01:33 01:01:33.491 [ERROR] > org.apache.flink.runtime.jobmaster.JobMasterTest.testRetrievingCheckpointStats > Time elapsed: 0.036 s <<< ERROR! 
> Jul 20 01:01:33 java.lang.NullPointerException > Jul 20 01:01:33 at > org.apache.flink.runtime.jobmaster.JobMasterTest.testRetrievingCheckpointStats(JobMasterTest.java:2132) > Jul 20 01:01:33 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Jul 20 01:01:33 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Jul 20 01:01:33 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Jul 20 01:01:33 at java.lang.reflect.Method.invoke(Method.java:498) > Jul 20 01:01:33 at > org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727) > Jul 20 01:01:33 at > org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60) > Jul 20 01:01:33 at > org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131) > Jul 20 01:01:33 at > org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156) > Jul 20 01:01:33 at > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147) > Jul 20 01:01:33 at > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86) > Jul 20 01:01:33 at > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103) > Jul 20 01:01:33 at > org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93) > Jul 20 01:01:33 at > org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106) > Jul 20 01:01:33 at > org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64) > Jul 20 01:01:33 at > 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45) > ... > {noformat}
[jira] [Commented] (FLINK-32681) RocksDBStateDownloaderTest.testMultiThreadCleanupOnFailure unstable
[ https://issues.apache.org/jira/browse/FLINK-32681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748606#comment-17748606 ] Matthias Pohl commented on FLINK-32681: --- twice in the same build: * https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51777=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=10286 * https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51777=logs=77a9d8e1-d610-59b3-fc2a-4766541e0e33=125e07e7-8de0-5c6c-a541-a567415af3ef=10241
[GitHub] [flink] luoyuxia commented on a diff in pull request #23013: [FLINK-32620][sink] Migrate DiscardingSink to sinkv2.
luoyuxia commented on code in PR #23013: URL: https://github.com/apache/flink/pull/23013#discussion_r1277459386 ## flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/v2/DiscardingSink.java: ## @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.api.functions.sink.v2; + +import org.apache.flink.annotation.PublicEvolving; +import org.apache.flink.api.common.SupportsConcurrentExecutionAttempts; +import org.apache.flink.api.connector.sink2.Sink; +import org.apache.flink.api.connector.sink2.SinkWriter; + +import java.io.IOException; + +/** + * A special sink that ignores all elements. + * + * @param The type of elements received by the sink. + */ +@PublicEvolving Review Comment: Just wondering is there any dicuss or vote for making it to `PublicEvolving`. 
## flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/table/stream/StreamingSink.java: ## @@ -173,7 +173,7 @@ public static DataStreamSink sink( } DataStreamSink discardingSink = -stream.addSink(new DiscardingSink<>()).name("end").setParallelism(1); +stream.sinkTo(new DiscardingSink<>()).name("end").setParallelism(1); Review Comment: Do you mean the `json_execution_plan` or the `execNode plan` described in [FLIP-190](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489#FLIP190:SupportVersionUpgradesforTableAPI)? As long as we keep the `execNode plan` the same, compatibility won't break. As far as I can see, the `execNode plan` is not changed by this PR, so I don't think it will break compatibility. ## flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/translators/SinkTransformationTranslator.java: ## @@ -315,6 +315,13 @@ private R adjustTransformations( Transformation::getDescription, Transformation::setDescription); +// handle coLocationGroupKey. Review Comment: Why change it in this PR?
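The sink being migrated in this PR can be shown with a self-contained analog. The interfaces below are simplified stand-ins, not Flink's actual Sink V2 API: the pattern is simply a sink whose writer drops every element it receives.

```java
// Simplified stand-ins for the Sink V2 shape (not Flink's real interfaces).
interface SinkWriter<T> {
    void write(T element);
}

interface Sink<T> {
    SinkWriter<T> createWriter();
}

// Analog of the DiscardingSink in the diff: the writer ignores all elements.
class DiscardingSink<T> implements Sink<T> {
    @Override
    public SinkWriter<T> createWriter() {
        return element -> { /* intentionally drop the element */ };
    }
}

public class DiscardingSinkSketch {
    public static void main(String[] args) {
        SinkWriter<String> writer = new DiscardingSink<String>().createWriter();
        writer.write("ignored"); // no output, no state: the element is dropped
    }
}
```

Such a sink is useful as a terminal "end" node (as in the `StreamingSink` diff above) when a pipeline must be closed off but its output is irrelevant.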
[GitHub] [flink] Tartarus0zm commented on pull request #23060: [FLINK-32519][docs] Add doc for [CREATE OR] REPLACE TABLE AS statement
Tartarus0zm commented on PR #23060: URL: https://github.com/apache/flink/pull/23060#issuecomment-1655600364 @luoyuxia thanks for your review! PTAL
[jira] [Commented] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748584#comment-17748584 ] Matthias Pohl commented on FLINK-31168: --- (y) I was able to reproduce this test failure using JDK17 (after 3 runs and after 74 runs). I'm gonna go ahead and investigate it further. > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Blocker > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056) > ... 4 more > Caused by: java.lang.RuntimeException: Error while waiting for job to be > initialized > at > org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160) > at > org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82) > at > org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73) > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) > at > java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183) > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.runtime.rest.util.RestClientException: > [org.apache.flink.runtime.rest.NotFoundException: Job > 865dcd87f4828dbeb3d93eb52e2636b1 not found > at > org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$1(AbstractExecutionGraphHandler.java:99) > at > java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) > at > java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at > 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) > at > org.apache.flink.runtime.rest.handler.legacy.DefaultExecutionGraphCache.lambda$getExecutionGraphInternal$0(DefaultExecutionGraphCache.java:109) > at > java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) > at > java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at >
[GitHub] [flink] luoyuxia commented on a diff in pull request #23060: [FLINK-32519][docs] Add doc for [CREATE OR] REPLACE TABLE AS statement
luoyuxia commented on code in PR #23060: URL: https://github.com/apache/flink/pull/23060#discussion_r1277438779 ## docs/content/docs/dev/table/sql/create.md: ## @@ -557,6 +558,59 @@ INSERT INTO my_ctas_table SELECT id, name, age FROM source_table WHERE mod(id, 1 {{< top >}} +## [CREATE OR] REPLACE TABLE +```sql +[CREATE OR] REPLACE TABLE [catalog_name.][db_name.]table_name +[COMMENT table_comment] +WITH (key1=val1, key2=val2, ...) +AS select_query +``` + +**Note** RTAS has the following semantic: +* REPLACE TABLE AS SELECT statement: the target table to be replaced must exist, otherwise, an exception will be thrown. +* CREATE OR REPLACE TABLE AS SELECT statement: the target table to be replaced will be created if it does not exist; if it does exist, it'll be replaced. + +Tables can also be replaced(or created) and populated by the results of a query in one [CREATE OR] REPLACE TABLE(RTAS) statement. RTAS is the simplest and fastest way to replace and insert data into a table with a single command. Review Comment: ```suggestion Tables can also be replaced(or created) and populated by the results of a query in one [CREATE OR] REPLACE TABLE AS SELECT(RTAS) statement. RTAS is the simplest and fastest way to replace(or create) and insert data into a table with a single command. ``` ## docs/content/docs/dev/table/sql/create.md: ## @@ -557,6 +558,59 @@ INSERT INTO my_ctas_table SELECT id, name, age FROM source_table WHERE mod(id, 1 {{< top >}} +## [CREATE OR] REPLACE TABLE +```sql +[CREATE OR] REPLACE TABLE [catalog_name.][db_name.]table_name +[COMMENT table_comment] +WITH (key1=val1, key2=val2, ...) +AS select_query +``` + +**Note** RTAS has the following semantic: +* REPLACE TABLE AS SELECT statement: the target table to be replaced must exist, otherwise, an exception will be thrown. +* CREATE OR REPLACE TABLE AS SELECT statement: the target table to be replaced will be created if it does not exist; if it does exist, it'll be replaced. 
+ +Tables can also be replaced(or created) and populated by the results of a query in one [CREATE OR] REPLACE TABLE(RTAS) statement. RTAS is the simplest and fastest way to replace and insert data into a table with a single command. Review Comment: nit: I'd like to remove `also`, for there's no context in here. It's not like ctas part.' ```suggestion Tables can be replaced(or created) and populated by the results of a query in one [CREATE OR] REPLACE TABLE(RTAS) statement. RTAS is the simplest and fastest way to replace and insert data into a table with a single command. ``` ## docs/content.zh/docs/dev/table/sql/create.md: ## @@ -557,6 +558,58 @@ INSERT INTO my_ctas_table SELECT id, name, age FROM source_table WHERE mod(id, 1 {{< top >}} +## [CREATE OR] REPLACE TABLE +```sql +[CREATE OR] REPLACE TABLE [catalog_name.][db_name.]table_name +[COMMENT table_comment] +WITH (key1=val1, key2=val2, ...) +AS select_query +``` + +**注意** RTAS 有如下语义: +* REPLACE TABLE AS SELECT 语句:要被替换的目标表必须存在,否则会报错。 +* CREATE OR REPLACE TABLE AS SELECT 语句:要被替换的目标表如果不存在,引擎会自动创建;如果存在的话,引擎就直接替换它。 + +表也可以通过一个 [CREATE OR] REPLACE TABLE(RTAS) 语句中的查询结果来替换(或创建)并填充数据,RTAS 是一种简单快捷的替换表并插入数据的方法。 Review Comment: ```suggestion 表可以通过一个 [CREATE OR] REPLACE TABLE(RTAS)语句中的查询结果来替换(或创建)并填充数据,RTAS 是一种简单快捷的替换表并插入数据的方法。 ``` ## docs/content.zh/docs/dev/table/sql/create.md: ## @@ -557,6 +558,58 @@ INSERT INTO my_ctas_table SELECT id, name, age FROM source_table WHERE mod(id, 1 {{< top >}} +## [CREATE OR] REPLACE TABLE +```sql +[CREATE OR] REPLACE TABLE [catalog_name.][db_name.]table_name +[COMMENT table_comment] +WITH (key1=val1, key2=val2, ...) 
+AS select_query +``` + +**注意** RTAS 有如下语义: +* REPLACE TABLE AS SELECT 语句:要被替换的目标表必须存在,否则会报错。 +* CREATE OR REPLACE TABLE AS SELECT 语句:要被替换的目标表如果不存在,引擎会自动创建;如果存在的话,引擎就直接替换它。 + +表也可以通过一个 [CREATE OR] REPLACE TABLE(RTAS) 语句中的查询结果来替换(或创建)并填充数据,RTAS 是一种简单快捷的替换表并插入数据的方法。 Review Comment: ```suggestion 表可以通过一个 [CREATE OR] REPLACE TABLE(RTAS)语句中的查询结果来替换(或创建)并填充数据,RTAS 是一种简单快捷的替换表并插入数据的方法。 ``` ## docs/content.zh/docs/dev/table/sql/create.md: ## @@ -557,6 +558,58 @@ INSERT INTO my_ctas_table SELECT id, name, age FROM source_table WHERE mod(id, 1 {{< top >}} +## [CREATE OR] REPLACE TABLE +```sql +[CREATE OR] REPLACE TABLE [catalog_name.][db_name.]table_name +[COMMENT table_comment] +WITH (key1=val1, key2=val2, ...) +AS select_query +``` + +**注意** RTAS 有如下语义: +* REPLACE TABLE AS SELECT 语句:要被替换的目标表必须存在,否则会报错。 +* CREATE OR REPLACE TABLE AS SELECT 语句:要被替换的目标表如果不存在,引擎会自动创建;如果存在的话,引擎就直接替换它。 + +表也可以通过一个 [CREATE OR] REPLACE TABLE(RTAS) 语句中的查询结果来替换(或创建)并填充数据,RTAS 是一种简单快捷的替换表并插入数据的方法。 Review Comment: ```suggestion 表也可以通过一个 [CREATE OR] REPLACE TABLE(RTAS)语句中的查询结果来替换(或创建)并填充数据,RTAS 是一种简单快捷的替换(或创建)表并插入数据的方法。 ``` ## docs/content.zh/docs/dev/table/sql/create.md: ## @@ -557,6 +558,58 @@ INSERT INTO
[jira] [Closed] (FLINK-32691) Make it possible to use builtin functions without catalog/db set
[ https://issues.apache.org/jira/browse/FLINK-32691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Wysakowicz closed FLINK-32691. Resolution: Fixed Fixed in 3f63e03e83144e9857834f8db1895637d2aa218a > Make it possible to use builtin functions without catalog/db set > > > Key: FLINK-32691 > URL: https://issues.apache.org/jira/browse/FLINK-32691 > Project: Flink > Issue Type: Bug >Reporter: Jim Hughes >Assignee: Jim Hughes >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > Relative to https://issues.apache.org/jira/browse/FLINK-32584, function > lookup fails without the catalog and database set. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32650) Added the ability to split flink-protobuf codegen code
[ https://issues.apache.org/jira/browse/FLINK-32650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser updated FLINK-32650: --- Fix Version/s: (was: 1.18.0) > Added the ability to split flink-protobuf codegen code > -- > > Key: FLINK-32650 > URL: https://issues.apache.org/jira/browse/FLINK-32650 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Affects Versions: 1.18.0 >Reporter: 李精卫 >Priority: Major > > h3. Background > Flink serializes and deserializes protobuf-format data by calling the decode or encode method in GeneratedProtoToRow_XXX.java, which is generated by codegen to parse byte[] data into protobuf Java objects. The size of the generated decode/encode method body grows with the number of fields defined in the protobuf message. When the number of fields exceeds a certain threshold and the compiled method body exceeds 8 KB, the decode/encode method will no longer be optimized by the JIT, seriously degrading serialization and deserialization performance. If the compiled method body exceeds 64 KB, the task will fail to start altogether. > h3. Solution > I propose a codegen splitter for protobuf parsing that splits the encode/decode method to solve this problem. > The idea is as follows. Currently, the parsing of every field defined in the protobuf message is placed in a single decode/encode method body. Since the fields share no parameters, multiple fields can be grouped together and parsed in a split method body. Whenever the number of statements in the current method body exceeds the threshold, a split method is generated, those fields are parsed inside it, and the split method is called from the decode/encode method. Repeating this process yields a decode/encode method composed of calls to split methods.
> Example decode method after splitting: > > {code:java} > public static RowData decode(org.apache.flink.formats.protobuf.testproto.AdProfile.AdProfilePb message) { > RowData rowData = null; > org.apache.flink.formats.protobuf.testproto.AdProfile.AdProfilePb message1242 = message; > GenericRowData rowData1242 = new GenericRowData(5); > split2585(rowData1242, message1242); > rowData = rowData1242; > return rowData; > } > {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
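The splitting approach described in FLINK-32650 above can be illustrated with a small, self-contained sketch. This is not the actual generated code: the `split0`/`split1` helpers and the plain arrays are hypothetical stand-ins for the generated split methods and the protobuf/`RowData` types, used only to show how field assignments are grouped into helper methods so that each method body stays below the JIT compilation limits.

```java
import java.util.Arrays;

public class SplitDecodeSketch {
    // Simulates the generated decode(): rather than assigning all fields
    // inline, it delegates to split methods, each handling a bounded group
    // of fields, keeping every method body small.
    public static Object[] decode(int[] message) {
        Object[] row = new Object[message.length];
        split0(row, message); // fields 0..1
        split1(row, message); // fields 2..3
        return row;
    }

    // Each split method parses a fixed group of fields; the fields share no
    // state, so the grouping is arbitrary and bounded only by method size.
    private static void split0(Object[] row, int[] message) {
        row[0] = message[0];
        row[1] = message[1];
    }

    private static void split1(Object[] row, int[] message) {
        row[2] = message[2];
        row[3] = message[3];
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decode(new int[] {1, 2, 3, 4})));
    }
}
```

The real generator applies the same idea per message type, emitting a new split method whenever the accumulated statement count crosses its threshold.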
[GitHub] [flink-connector-rabbitmq] MartijnVisser commented on pull request #1: [FLINK-20628] RabbitMQ Connector using FLIP-27 Source API
MartijnVisser commented on PR #1: URL: https://github.com/apache/flink-connector-rabbitmq/pull/1#issuecomment-1655509560 > I am working for it now. Any update to report from your end on this @RocMarshal ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-32672) Migrate RabbitMQ connector to Source V2 API
[ https://issues.apache.org/jira/browse/FLINK-32672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748561#comment-17748561 ] Martijn Visser commented on FLINK-32672: There is already FLINK-20628 and a PR https://github.com/apache/flink-connector-rabbitmq/pull/1 - it just appears that the committer isn't responding to the review comments left by [~chesnay]. It's kind of a similar problem to the Google Pubsub connector: we need maintainers, else I don't see how we can keep this connector alive. > Migrate RabbitMQ connector to Source V2 API > --- > > Key: FLINK-32672 > URL: https://issues.apache.org/jira/browse/FLINK-32672 > Project: Flink > Issue Type: Sub-task > Components: Connectors / RabbitMQ >Reporter: Alexander Fedulov >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32673) Migrate Google PubSub connector to V2
[ https://issues.apache.org/jira/browse/FLINK-32673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748559#comment-17748559 ] Martijn Visser commented on FLINK-32673: There are two things, imho: 1. I can restart the discussion I originally opened on this at https://lists.apache.org/thread/c1stxqmwdzxmpjp6pj2pxsvmpbtwhqnf to see if someone comes forward to help out 2. If no-one comes forward, then I would be inclined to open a vote to remove the connector from Flink given a lack of maintainers > Migrate Google PubSub connector to V2 > - > > Key: FLINK-32673 > URL: https://issues.apache.org/jira/browse/FLINK-32673 > Project: Flink > Issue Type: Sub-task >Reporter: Alexander Fedulov >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32670) Cascade deprecation to classes that implement SourceFunction
[ https://issues.apache.org/jira/browse/FLINK-32670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Fedulov updated FLINK-32670: -- Description: (was: ParallelSourceFunction, RichParallelSourceFunction, ExternallyInducedSource) > Cascade deprecation to classes that implement SourceFunction > > > Key: FLINK-32670 > URL: https://issues.apache.org/jira/browse/FLINK-32670 > Project: Flink > Issue Type: Sub-task >Reporter: Alexander Fedulov >Assignee: Alexander Fedulov >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-32697) When using JDBC to insert data into the oracle database, an error will be reported if the target field is lowercase,
[ https://issues.apache.org/jira/browse/FLINK-32697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748557#comment-17748557 ] Martijn Visser commented on FLINK-32697: [~sunyanyong] Do you want to fix it? > When using JDBC to insert data into the oracle database, an error will be > reported if the target field is lowercase, > > > Key: FLINK-32697 > URL: https://issues.apache.org/jira/browse/FLINK-32697 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: jdbc-3.1.1 >Reporter: sunyanyong >Priority: Major > Attachments: image-2023-07-27-10-39-31-884.png > > > When using JDBC to insert data into an Oracle database, an error is reported if the target field name is lowercase. > An example input is "insert into tableName(ID, `"name"`) values(1, 'test')". > The error is caused by the judgment logic in FieldNamedPreparedStatementImpl.java, which treats double quotes as delimiters. > !image-2023-07-27-10-39-31-884.png|width=876,height=540! > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
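The quote-handling problem described in FLINK-32697 above can be illustrated with a quote-aware named-parameter parser. This is a hedged, illustrative sketch only — the `parse` method, its signature, and the `:name`-style placeholders are assumptions for the example and are not the actual FieldNamedPreparedStatementImpl code; the point is that quoted regions must be copied verbatim so a double-quoted (e.g. lowercase Oracle) identifier is never mistaken for a delimiter.

```java
import java.util.ArrayList;
import java.util.List;

public class NamedParameterParser {
    // Replaces :name placeholders with '?' while leaving double-quoted
    // identifiers and single-quoted string literals untouched.
    public static String parse(String sql, List<String> paramsOut) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < sql.length()) {
            char c = sql.charAt(i);
            if (c == '"' || c == '\'') {
                // Copy the whole quoted region verbatim.
                int end = sql.indexOf(c, i + 1);
                end = (end < 0) ? sql.length() - 1 : end;
                out.append(sql, i, end + 1);
                i = end + 1;
            } else if (c == ':' && i + 1 < sql.length()
                    && Character.isJavaIdentifierStart(sql.charAt(i + 1))) {
                // Collect the placeholder name and emit a positional '?'.
                int j = i + 1;
                while (j < sql.length() && Character.isJavaIdentifierPart(sql.charAt(j))) {
                    j++;
                }
                paramsOut.add(sql.substring(i + 1, j));
                out.append('?');
                i = j;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> params = new ArrayList<>();
        String parsed = parse("INSERT INTO t (ID, \"name\") VALUES (:id, :name)", params);
        System.out.println(parsed);
        System.out.println(params);
    }
}
```

With this approach, `"name"` passes through unchanged while `:id` and `:name` become positional parameters.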
[jira] [Updated] (FLINK-32697) When using JDBC to insert data into the oracle database, an error will be reported if the target field is lowercase,
[ https://issues.apache.org/jira/browse/FLINK-32697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser updated FLINK-32697: --- Priority: Major (was: Blocker) > When using JDBC to insert data into the oracle database, an error will be > reported if the target field is lowercase, > > > Key: FLINK-32697 > URL: https://issues.apache.org/jira/browse/FLINK-32697 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: jdbc-3.1.1 >Reporter: sunyanyong >Priority: Major > Attachments: image-2023-07-27-10-39-31-884.png > > > When using JDBC to insert data into the oracle database, if the target field > is lowercase, an error will be reported. > An example input format is "insert into tableName(ID, `"name"`) values(1, > 'test')". > The reason for the error is that there is a problem with the judgment logic > in the > FieldNamedPreparedStatementImpl.java. The judgment logic will treat double > quotes as delimiters. > !image-2023-07-27-10-39-31-884.png|width=876,height=540! > > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32670) Cascade deprecation to classes that implement SourceFunction
[ https://issues.apache.org/jira/browse/FLINK-32670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Fedulov updated FLINK-32670: -- Summary: Cascade deprecation to classes that implement SourceFunction (was: Annotate interfaces that inherit from SourceFunction as deprecated ) > Cascade deprecation to classes that implement SourceFunction > > > Key: FLINK-32670 > URL: https://issues.apache.org/jira/browse/FLINK-32670 > Project: Flink > Issue Type: Sub-task >Reporter: Alexander Fedulov >Assignee: Alexander Fedulov >Priority: Major > Labels: pull-request-available > > ParallelSourceFunction, RichParallelSourceFunction, ExternallyInducedSource -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-32694) Migrate classes that implement ParallelSourceFunction to Source V2 API
[ https://issues.apache.org/jira/browse/FLINK-32694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Fedulov updated FLINK-32694: -- Summary: Migrate classes that implement ParallelSourceFunction to Source V2 API (was: Cascade deprecation to classes that implement ParallelSourceFunction) > Migrate classes that implement ParallelSourceFunction to Source V2 API > -- > > Key: FLINK-32694 > URL: https://issues.apache.org/jira/browse/FLINK-32694 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common >Reporter: Alexander Fedulov >Assignee: Alexander Fedulov >Priority: Major > > > * SimpleEndlessSourceWithBloatedState in HeavyDeploymentStressTestProgram > (org.apache.flink.deployment) > * StatefulSequenceSource (org.apache.flink.streaming.api.functions.source) > * ArrowSourceFunction (org.apache.flink.table.runtime.arrow.sources) > * InputFormatSourceFunction (org.apache.flink.streaming.api.functions.source) > * EventsGeneratorSource > (org.apache.flink.streaming.examples.statemachine.generator) > * FromSplittableIteratorFunction > (org.apache.flink.streaming.api.functions.source) > * DataGeneratorSource > (org.apache.flink.streaming.api.functions.source.datagen) > * -- Tests: > * MySource in StatefulStreamingJob (org.apache.flink.test) > * RandomLongSource in StickyAllocationAndLocalRecoveryTestJob > (org.apache.flink.streaming.tests) > * SequenceGeneratorSource (org.apache.flink.streaming.tests) > * TtlStateUpdateSource (org.apache.flink.streaming.tests) > * StringSourceFunction in NettyShuffleMemoryControlTestProgram > (org.apache.flink.streaming.tests) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32694) Cascade deprecation to classes that implement ParallelSourceFunction
[ https://issues.apache.org/jira/browse/FLINK-32694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Fedulov reassigned FLINK-32694: - Assignee: Alexander Fedulov > Cascade deprecation to classes that implement ParallelSourceFunction > > > Key: FLINK-32694 > URL: https://issues.apache.org/jira/browse/FLINK-32694 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common >Reporter: Alexander Fedulov >Assignee: Alexander Fedulov >Priority: Major > > > * SimpleEndlessSourceWithBloatedState in HeavyDeploymentStressTestProgram > (org.apache.flink.deployment) > * StatefulSequenceSource (org.apache.flink.streaming.api.functions.source) > * ArrowSourceFunction (org.apache.flink.table.runtime.arrow.sources) > * InputFormatSourceFunction (org.apache.flink.streaming.api.functions.source) > * EventsGeneratorSource > (org.apache.flink.streaming.examples.statemachine.generator) > * FromSplittableIteratorFunction > (org.apache.flink.streaming.api.functions.source) > * DataGeneratorSource > (org.apache.flink.streaming.api.functions.source.datagen) > * -- Tests: > * MySource in StatefulStreamingJob (org.apache.flink.test) > * RandomLongSource in StickyAllocationAndLocalRecoveryTestJob > (org.apache.flink.streaming.tests) > * SequenceGeneratorSource (org.apache.flink.streaming.tests) > * TtlStateUpdateSource (org.apache.flink.streaming.tests) > * StringSourceFunction in NettyShuffleMemoryControlTestProgram > (org.apache.flink.streaming.tests) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] yunfengzhou-hub commented on a diff in pull request #22931: [FLINK-32514] Support configuring checkpointing interval during process backlog
yunfengzhou-hub commented on code in PR #22931: URL: https://github.com/apache/flink/pull/22931#discussion_r1277383109 ## flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java: ## @@ -183,6 +189,15 @@ public class CheckpointCoordinator { /** A handle to the current periodic trigger, to cancel it when necessary. */ private ScheduledFuture currentPeriodicTrigger; +/** + * The timestamp (via {@link Clock#relativeTimeMillis()}) when the next checkpoint will be + * triggered. + * + * If its value is {@link Long#MAX_VALUE}, it means there is not a next checkpoint + * scheduled. + */ +private volatile long nextCheckpointTriggeringRelativeTime; Review Comment: `nextCheckpointTriggeringRelativeTime` and `currentPeriodicTrigger` should be updated together, so I added `synchronized(lock){}` around code blocks related to these variables instead of adding `@GuardedBy("lock")` to each of them. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
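The locking pattern discussed in the review comment above — always reading and writing the two related fields together under one shared lock, rather than annotating each field individually with `@GuardedBy` — can be sketched as follows. The class and field names here are simplified stand-ins for the CheckpointCoordinator fields, and `Runnable` stands in for the `ScheduledFuture` trigger handle; this is an illustration of the pattern, not the PR's implementation.

```java
public class TriggerState {
    private final Object lock = new Object();

    // Long.MAX_VALUE means no next checkpoint is scheduled.
    private long nextTriggerRelativeTime = Long.MAX_VALUE;
    private Runnable currentPeriodicTrigger;

    public void schedule(long triggerTime, Runnable trigger) {
        synchronized (lock) {
            // Both fields change atomically, so a reader never observes a
            // trigger time that belongs to a stale trigger handle.
            nextTriggerRelativeTime = triggerTime;
            currentPeriodicTrigger = trigger;
        }
    }

    public void cancel() {
        synchronized (lock) {
            nextTriggerRelativeTime = Long.MAX_VALUE;
            currentPeriodicTrigger = null;
        }
    }

    public boolean hasScheduledCheckpoint() {
        synchronized (lock) {
            // Reading under the same lock keeps the pair consistent.
            return nextTriggerRelativeTime != Long.MAX_VALUE && currentPeriodicTrigger != null;
        }
    }
}
```

Wrapping every access in `synchronized (lock)` gives the same guarantee that a per-field `@GuardedBy("lock")` annotation documents, while making the pairwise update explicit in the code.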
[jira] [Comment Edited] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748537#comment-17748537 ] Matthias Pohl edited comment on FLINK-31168 at 7/28/23 10:18 AM: - So far, I wasn't able to reproduce it locally with up to now 300 test runs. Update: The test runs happened with JDK11. I noticed that the two test failures in question happened with JDK17. I'm gonna continue looking into the issue in that direction. was (Author: mapohl): So far, I wasn't able to reproduce it locally with up to now 230 test runs. > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Blocker > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: 
java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056) > ... 4 more > Caused by: java.lang.RuntimeException: Error while waiting for job to be > initialized > at > org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160) > at > org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82) > at > org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73) > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) > at > java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183) > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.runtime.rest.util.RestClientException: > [org.apache.flink.runtime.rest.NotFoundException: Job > 865dcd87f4828dbeb3d93eb52e2636b1 not found > at > org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$1(AbstractExecutionGraphHandler.java:99) > at > java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) > at > 
java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at > java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) > at > org.apache.flink.runtime.rest.handler.legacy.DefaultExecutionGraphCache.lambda$getExecutionGraphInternal$0(DefaultExecutionGraphCache.java:109) > at > java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) > at >
[jira] [Commented] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748537#comment-17748537 ] Matthias Pohl commented on FLINK-31168: --- So far, I wasn't able to reproduce it locally with up to now 230 test runs. > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Blocker > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > 
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056) > ... 4 more > Caused by: java.lang.RuntimeException: Error while waiting for job to be > initialized > at > org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160) > at > org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82) > at > org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73) > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) > at > java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183) > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.runtime.rest.util.RestClientException: > [org.apache.flink.runtime.rest.NotFoundException: Job > 865dcd87f4828dbeb3d93eb52e2636b1 not found > at > org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$1(AbstractExecutionGraphHandler.java:99) > at > java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) > at > java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at > 
java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) > at > org.apache.flink.runtime.rest.handler.legacy.DefaultExecutionGraphCache.lambda$getExecutionGraphInternal$0(DefaultExecutionGraphCache.java:109) > at > java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) > at > java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at > java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) > at >
[jira] [Comment Edited] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748533#comment-17748533 ] Matthias Pohl edited comment on FLINK-31168 at 7/28/23 9:49 AM: It appears that the leader election is not causing the problem: The JobGraph is written properly to the JobGraphStore under {{flink/default/jobgraphs}} in the JM run #0 (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9106]): {code} Jul 17 01:09:58 01:09:47,954 10630 [flink-akka.actor.default-dispatcher-20] INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Added JobGraph(jobId: 10ea837b5ff5916e57e0f49298d952d3) to ZooKeeperStateHandleStore{namespace='flink/default/jobgraphs'}. {code} A leader is picked in JM run #1 that tries to recover the data from the correct ZNode {{flink/default/jobgraphs}} (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9917]). But it appears that the ZNode is empty: {code} Jul 17 01:09:59 01:09:55,657 6184 [cluster-io-thread-2] INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieved job ids [] from ZooKeeperStateHandleStore{namespace='flink/default/jobgraphs'} Jul 17 01:09:59 01:09:55,657 6184 [cluster-io-thread-2] INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Successfully recovered 0 persisted job graphs. 
{code} was (Author: mapohl): It appears that the leader election is not causing the problem: The JobGraph is written properly to the JobGraphStore under {{flink/default/jobgraphs}} in the JM run #0 (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9106]): {code} INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Added JobGraph(jobId: 10ea837b5ff5916e57e0f49298d952d3) to ZooKeeperStateHandleStore{namespace='flink/default/jobgraphs'}. {code} A leader is picked in JM run #1 that tries to recover the data from the correct ZNode {{flink/default/jobgraphs}} (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9917]). But it appears that the ZNode is empty: {code} Jul 17 01:09:59 01:09:55,657 6184 [cluster-io-thread-2] INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieved job ids [] from ZooKeeperStateHandleStore{namespace='flink/default/jobgraphs'} Jul 17 01:09:59 01:09:55,657 6184 [cluster-io-thread-2] INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Successfully recovered 0 persisted job graphs. 
{code} > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Blocker > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at >
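The invariant being violated in the analysis above can be sketched with a minimal in-memory stand-in for the job-graph store contract (a hypothetical simplification, not Flink's actual ZooKeeperStateHandleStore): a job graph added by JM run #0 must still be listed when JM run #1 recovers; the failing build instead observed `Retrieved job ids []`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal stand-in for a job-graph store keyed by job ID.
// In the real setup this is a ZooKeeper ZNode (flink/default/jobgraphs).
class InMemoryJobGraphStore {
    private final Map<String, String> graphs = new LinkedHashMap<>();

    void addJobGraph(String jobId, String graph) {
        graphs.put(jobId, graph); // JM run #0: "Added JobGraph(jobId: ...)"
    }

    List<String> getJobIds() {
        return new ArrayList<>(graphs.keySet()); // JM run #1: recovery reads this
    }
}

public class RecoveryInvariant {
    public static void main(String[] args) {
        InMemoryJobGraphStore store = new InMemoryJobGraphStore();
        store.addJobGraph("10ea837b5ff5916e57e0f49298d952d3", "graph");
        // Expected after failover: exactly the persisted job id is recovered.
        List<String> recovered = store.getJobIds();
        if (recovered.size() != 1) throw new AssertionError("recovery lost the job graph");
        System.out.println("Recovered " + recovered.size() + " persisted job graph(s).");
    }
}
```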
[jira] [Comment Edited] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748533#comment-17748533 ] Matthias Pohl edited comment on FLINK-31168 at 7/28/23 9:48 AM: It appears that the leader election is not causing the problem: The JobGraph is written properly to the JobGraphStore under {{flink/default/jobgraphs}} in the JM run #0 (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9106]): {quote} INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Added JobGraph(jobId: 10ea837b5ff5916e57e0f49298d952d3) to ZooKeeperStateHandleStore{namespace='flink/default/jobgraphs'}. {quote} A leader is picked in JM run #1 that tries to recover the data from the correct ZNode {{flink/default/jobgraphs}} (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9917]). But it appears that the ZNode is empty: {quote} Jul 17 01:09:59 01:09:55,657 6184 [cluster-io-thread-2] INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieved job ids [] from ZooKeeperStateHandleStore{namespace='flink/default/jobgraphs'} Jul 17 01:09:59 01:09:55,657 6184 [cluster-io-thread-2] INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Successfully recovered 0 persisted job graphs. 
{quote} was (Author: mapohl): It appears that the leader election is not causing the problem: The JobGraph is written properly to the JobGraphStore under {{flink/default/jobgraphs}} in the JM run #0 (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9106]) and a leader is picked in JM run #1 that tries to recover the data from the correct ZNode {{flink/default/jobgraphs}} (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9917]). But it appears that the ZNode is empty. > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Blocker > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > 
org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056) > ... 4 more > Caused by: java.lang.RuntimeException: Error while waiting for job to be > initialized > at > org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160) > at > org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82) > at > org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73) > at >
[jira] [Comment Edited] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748533#comment-17748533 ] Matthias Pohl edited comment on FLINK-31168 at 7/28/23 9:48 AM: It appears that the leader election is not causing the problem: The JobGraph is written properly to the JobGraphStore under {{flink/default/jobgraphs}} in the JM run #0 (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9106]): {code} INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Added JobGraph(jobId: 10ea837b5ff5916e57e0f49298d952d3) to ZooKeeperStateHandleStore{namespace='flink/default/jobgraphs'}. {code} A leader is picked in JM run #1 that tries to recover the data from the correct ZNode {{flink/default/jobgraphs}} (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9917]). But it appears that the ZNode is empty: {code} Jul 17 01:09:59 01:09:55,657 6184 [cluster-io-thread-2] INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieved job ids [] from ZooKeeperStateHandleStore{namespace='flink/default/jobgraphs'} Jul 17 01:09:59 01:09:55,657 6184 [cluster-io-thread-2] INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Successfully recovered 0 persisted job graphs. 
{code} was (Author: mapohl): It appears that the leader election is not causing the problem: The JobGraph is written properly to the JobGraphStore under {{flink/default/jobgraphs}} in the JM run #0 (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9106]): {quote} INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Added JobGraph(jobId: 10ea837b5ff5916e57e0f49298d952d3) to ZooKeeperStateHandleStore{namespace='flink/default/jobgraphs'}. {quote} A leader is picked in JM run #1 that tries to recover the data from the correct ZNode {{flink/default/jobgraphs}} (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9917]). But it appears that the ZNode is empty: {quote} Jul 17 01:09:59 01:09:55,657 6184 [cluster-io-thread-2] INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Retrieved job ids [] from ZooKeeperStateHandleStore{namespace='flink/default/jobgraphs'} Jul 17 01:09:59 01:09:55,657 6184 [cluster-io-thread-2] INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Successfully recovered 0 persisted job graphs. 
{quote} > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Blocker > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
[jira] [Commented] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748533#comment-17748533 ] Matthias Pohl commented on FLINK-31168: --- It appears that the leader election is not causing the problem: The JobGraph is written properly to the JobGraphStore under {{flink/default/jobgraphs}} in the JM run #0 (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9106]) and a leader is picked in JM run #1 that tries to recover the data from the correct ZNode {{flink/default/jobgraphs}} (see [log line in build #51299|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9917]). But it appears that the ZNode is empty. > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Blocker > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > 
org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056) > ... 4 more > Caused by: java.lang.RuntimeException: Error while waiting for job to be > initialized > at > org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160) > at > org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82) > at > org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73) > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) > at > java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183) > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.runtime.rest.util.RestClientException: > 
[org.apache.flink.runtime.rest.NotFoundException: Job > 865dcd87f4828dbeb3d93eb52e2636b1 not found > at > org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$1(AbstractExecutionGraphHandler.java:99) > at > java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) > at > java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at >
[jira] [Created] (FLINK-32710) The LeaderElection component IDs for running is only the JobID which might be confusing in the log output
Matthias Pohl created FLINK-32710: - Summary: The LeaderElection component IDs for running is only the JobID which might be confusing in the log output Key: FLINK-32710 URL: https://issues.apache.org/jira/browse/FLINK-32710 Project: Flink Issue Type: Bug Components: Runtime / Coordination Affects Versions: 1.18.0 Reporter: Matthias Pohl I noticed that the leader log messages for the jobs use the plain job ID as the component ID. That might be confusing when reading the logs since it's a UUID with no additional context. We might want to add a prefix (e.g. {{job-}}) to these component IDs. -- This message was sent by Atlassian Jira (v8.20.10#820010)
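A minimal sketch of the prefixing proposed in the ticket (the helper name and ID format are assumptions for illustration, not the actual Flink change):

```java
import java.util.UUID;

public class ComponentIdPrefix {
    // Hypothetical helper: derive a leader-election component ID from a job ID.
    // Prefixing with "job-" gives the otherwise context-free UUID-like ID
    // some meaning when it shows up in log output.
    static String jobComponentId(String jobId) {
        return "job-" + jobId;
    }

    public static void main(String[] args) {
        // Flink job IDs render as 32 hex characters, similar to a UUID without dashes.
        String jobId = UUID.randomUUID().toString().replace("-", "");
        System.out.println("LeaderElection component: " + jobComponentId(jobId));
    }
}
```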
[jira] [Updated] (FLINK-32710) The LeaderElection component IDs for running is only the JobID which might be confusing in the log output
[ https://issues.apache.org/jira/browse/FLINK-32710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl updated FLINK-32710: -- Issue Type: Improvement (was: Bug) > The LeaderElection component IDs for running is only the JobID which might be > confusing in the log output > - > > Key: FLINK-32710 > URL: https://issues.apache.org/jira/browse/FLINK-32710 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Matthias Pohl >Priority: Minor > Labels: starter > > I noticed that the leader log messages for the jobs use the plain job ID as > the component ID. That might be confusing when reading the logs since it's a > UUID with no additional context. > We might want to add a prefix (e.g. {{job-}}) to these component IDs.
[jira] [Updated] (FLINK-32710) The LeaderElection component IDs for running is only the JobID which might be confusing in the log output
[ https://issues.apache.org/jira/browse/FLINK-32710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl updated FLINK-32710: -- Labels: starter (was: ) > The LeaderElection component IDs for running is only the JobID which might be > confusing in the log output > - > > Key: FLINK-32710 > URL: https://issues.apache.org/jira/browse/FLINK-32710 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Matthias Pohl >Priority: Minor > Labels: starter > > I noticed that the leader log messages for the jobs use the plain job ID as > the component ID. That might be confusing when reading the logs since it's a > UUID with no additional context. > We might want to add a prefix (e.g. {{job-}}) to these component IDs.
[GitHub] [flink] 1996fanrui commented on pull request #23094: [FLINK-32705][connectors/common] Remove the beta for watermark alignment
1996fanrui commented on PR #23094: URL: https://github.com/apache/flink/pull/23094#issuecomment-1655369644 The actual CI passed; only the log upload and docker image caching steps failed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748491#comment-17748491 ] Matthias Pohl edited comment on FLINK-31168 at 7/28/23 9:20 AM: The most recent failures seem to have been caused by the job recovery not being successful: * [20230711.1 (#51165) Dispatcher #1 output in line 8688|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=8688] * [20230717.1 (#51299) Dispatcher #1 output in line 9918|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9918] I'm gonna raise the priority of this issue to blocker because it could be related to the leader election changes. The job is not picked up anymore and therefore cannot be saved in the {{ExecutionGraphInfoStore}}. The job client will wait for the initialization phase to be over and then request the JobResult, which calls {{Dispatcher.requestJobStatus}}. {{requestJobStatus}} won't find a {{JobManagerRunner}} in {{Dispatcher#jobManagerRunnerRegistry}} and none in the {{Dispatcher#executionGraphInfoStore}} (see [Dispatcher#requestJobStatus|https://github.com/apache/flink/blob/ab9445aca56e7d139e8fd9bcc23e5f7e06288e66/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L902]). Therefore, the response future will complete exceptionally with a {{FlinkJobNotFoundException}}, causing the error which we're seeing in the last two CI failures. 
was (Author: mapohl): The most-recent failures seem to have been caused by the job recovery not being successful: * [20230711.1 (#51165) Dispatcher #1 output in line 8688|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=8688] * [20230717.1 (#51299) Dispatcher #1 output in line 9918|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9918] I'm gonna raise the priority of this issue to blocker because it could be related to the leader election changes. The job is not picked up anymore and therefore, cannot be saved in the {{ExecutionGraphInfoStore}}. The job client will wait for the initialization phase to be over and then requests the JobResult which calls {{Dispatcher.requestJobStatus}}. {{requestJobStatus}} won't find a {{JobManagerRunner}} in {{Dispatcher#jobManagerRunnerRegistry}} and non in the {{Dispatcher#executionGraphInfoStore}} (see [Dispatcher#requestJobStatus|https://github.com/apache/flink/blob/ab9445aca56e7d139e8fd9bcc23e5f7e06288e66/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L902]. Therefore, the response future will complete exceptionally with a {{FlinkJobNotFoundException}} causing the error which we're seeing in the last two CI failures. 
> JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Blocker > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at >
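The lookup order described in the comment above can be modeled with a simplified sketch (not Flink's actual Dispatcher code; names and types are reduced for illustration): running jobs are checked first, then the store of finished jobs, and only if neither knows the job does the request fail with a job-not-found error.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Simplified model of Dispatcher#requestJobStatus as described in the comment:
// first the registry of running jobs (~ jobManagerRunnerRegistry), then the
// store of finished jobs (~ executionGraphInfoStore), otherwise fail.
public class JobStatusLookup {
    final Map<String, String> runningJobs = new HashMap<>();
    final Map<String, String> finishedJobs = new HashMap<>();

    String requestJobStatus(String jobId) throws Exception {
        return Optional.ofNullable(runningJobs.get(jobId))
                .or(() -> Optional.ofNullable(finishedJobs.get(jobId)))
                // Neither store knows the job: the response future would
                // complete exceptionally, surfacing as "Job ... not found".
                .orElseThrow(() -> new Exception("FlinkJobNotFoundException: " + jobId));
    }

    public static void main(String[] args) throws Exception {
        JobStatusLookup d = new JobStatusLookup();
        d.finishedJobs.put("865dcd87f4828dbeb3d93eb52e2636b1", "FINISHED");
        System.out.println(d.requestJobStatus("865dcd87f4828dbeb3d93eb52e2636b1"));
        try {
            d.requestJobStatus("unknown");
        } catch (Exception e) {
            System.out.println("lookup failed as described: " + e.getMessage());
        }
    }
}
```

In the failing test runs, the job was never re-registered after failover, so both lookups missed and the client saw the `NotFoundException` from the stack trace.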
[jira] [Comment Edited] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748491#comment-17748491 ] Matthias Pohl edited comment on FLINK-31168 at 7/28/23 9:20 AM: The most recent failures seem to have been caused by the job recovery not being successful: * [20230711.1 (#51165) Dispatcher #1 output in line 8688|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=8688] * [20230717.1 (#51299) Dispatcher #1 output in line 9918|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9918] I'm gonna raise the priority of this issue to blocker because it could be related to the leader election changes. The job is not picked up anymore and therefore cannot be saved in the {{ExecutionGraphInfoStore}}. The job client will wait for the initialization phase to be over and then request the JobResult, which calls {{Dispatcher.requestJobStatus}}. {{requestJobStatus}} won't find a {{JobManagerRunner}} in {{Dispatcher#jobManagerRunnerRegistry}} and none in the {{Dispatcher#executionGraphInfoStore}} (see [Dispatcher#requestJobStatus|https://github.com/apache/flink/blob/ab9445aca56e7d139e8fd9bcc23e5f7e06288e66/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L902]). Therefore, the response future will complete exceptionally with a {{FlinkJobNotFoundException}}, causing the error which we're seeing in the last two CI failures. 
was (Author: mapohl): The most-recent failures seem to have been caused by the job recovery not being successful: * [20230711.1 (#51165) Dispatcher #1 output in line 8688|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=8688] * [20230717.1 (#51299) Dispatcher #1 output in line 9918|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9918] I'm gonna raise the priority of this issue to blocker because it could be related to the leader election changes. > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Blocker > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > 
org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056) > ... 4 more > Caused by: java.lang.RuntimeException: Error while waiting for job to be > initialized > at > org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160) > at > org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82) > at > org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73) > at >
[jira] [Updated] (FLINK-32708) Fix the write logic in remote tier of Hybrid Shuffle
[ https://issues.apache.org/jira/browse/FLINK-32708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wencong Liu updated FLINK-32708: Summary: Fix the write logic in remote tier of Hybrid Shuffle (was: Fix the write logic in remote tier of hybrid shuffle) > Fix the write logic in remote tier of Hybrid Shuffle > > > Key: FLINK-32708 > URL: https://issues.apache.org/jira/browse/FLINK-32708 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.18.0 >Reporter: Wencong Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > Currently, on the writer side in the remote tier, the flag file indicating > the latest segment id is updated first, followed by the creation of the data > file. This results in an incorrect order of file creation and we should fix > it.
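The ordering fix the ticket describes can be sketched as follows (file names and layout are assumptions for illustration, not Flink's actual remote-tier format): write the segment data file first, then update the flag file advertising the latest segment id, so a reader that trusts the flag never looks for a data file that does not exist yet.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of the corrected write order on the remote tier.
public class SegmentWriter {
    static void writeSegment(Path dir, int segmentId, byte[] data) throws IOException {
        // 1. Create the segment data file first...
        Files.write(dir.resolve("seg-" + segmentId), data);
        // 2. ...and only then advertise it in the flag file. The buggy order
        //    (flag before data) lets a reader see an id with no backing file.
        Files.write(dir.resolve("latest-segment"),
                Integer.toString(segmentId).getBytes());
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("remote-tier");
        writeSegment(dir, 0, new byte[]{1, 2, 3});
        int latest = Integer.parseInt(Files.readString(dir.resolve("latest-segment")));
        // Invariant: the advertised segment's data file exists.
        if (!Files.exists(dir.resolve("seg-" + latest))) throw new AssertionError();
        System.out.println("latest segment " + latest + " is readable");
    }
}
```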
[GitHub] [flink-connector-hbase] MartijnVisser commented on pull request #17: [FLINK-28013] Shade all netty dependencies in sql jar
MartijnVisser commented on PR #17: URL: https://github.com/apache/flink-connector-hbase/pull/17#issuecomment-1655299963 > this modification is to add netty packages other than 'netty-all', such as 'netty-transport' here, to the sql jar. Ah sorry, I misread. I'll try to have a look later today
[jira] [Updated] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl updated FLINK-31168: -- Fix Version/s: 1.18.0 > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Major > Labels: test-stability > Fix For: 1.18.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > 
org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056) > ... 4 more > Caused by: java.lang.RuntimeException: Error while waiting for job to be > initialized > at > org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160) > at > org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82) > at > org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73) > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) > at > java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183) > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.runtime.rest.util.RestClientException: > [org.apache.flink.runtime.rest.NotFoundException: Job > 865dcd87f4828dbeb3d93eb52e2636b1 not found > at > org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$1(AbstractExecutionGraphHandler.java:99) > at > java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) > at > java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at > java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) > at > 
org.apache.flink.runtime.rest.handler.legacy.DefaultExecutionGraphCache.lambda$getExecutionGraphInternal$0(DefaultExecutionGraphCache.java:109) > at > java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) > at > java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at > java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) > at >
[jira] [Updated] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl updated FLINK-31168: -- Priority: Blocker (was: Major) > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Blocker > Labels: test-stability > Fix For: 1.18.0
[jira] [Commented] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748491#comment-17748491 ] Matthias Pohl commented on FLINK-31168: --- The most-recent failures seem to have been caused by the job recovery not being successful: * [20230711.1 (#51165) Dispatcher #1 output in line 8688|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=8688] * [20230717.1 (#51299) Dispatcher #1 output in line 9918|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9918] I'm gonna raise the priority of this issue to blocker because it could be related to the leader election changes. > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Major > Labels: test-stability
[jira] [Assigned] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
[ https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fang Yong reassigned FLINK-32667: - Assignee: Fang Yong > Use standalone store and embedding writer for jobs with no-restart-strategy > in session cluster > -- > > Key: FLINK-32667 > URL: https://issues.apache.org/jira/browse/FLINK-32667 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Fang Yong >Assignee: Fang Yong >Priority: Major > Labels: pull-request-available > > When a Flink session cluster uses the ZooKeeper or Kubernetes high-availability service, it stores jobs in ZooKeeper or in a ConfigMap. Flink OLAP jobs submitted to the session cluster always turn off the restart strategy, so these no-restart-strategy jobs should not be stored in ZooKeeper or in a Kubernetes ConfigMap. -- This message was sent by Atlassian Jira (v8.20.10#820010)
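The selection logic described in the ticket can be sketched as follows. This is a minimal illustration of the idea, not Flink's actual API: the class and method names (`JobGraphStore`, `storeFor`, etc.) are assumptions, and the real implementation lives in the dispatcher's HA services.

```java
// Hypothetical sketch: jobs that disable the restart strategy bypass the
// durable (ZooKeeper/ConfigMap-backed) job graph store and use a standalone,
// in-memory store instead, since they will never be recovered.
import java.util.HashMap;
import java.util.Map;

public class JobStoreSelector {
    interface JobGraphStore { void put(String jobId, byte[] graph); }

    // Durable store, standing in for a ZooKeeper- or ConfigMap-backed store.
    static class HaJobGraphStore implements JobGraphStore {
        final Map<String, byte[]> backing = new HashMap<>();
        public void put(String jobId, byte[] graph) { backing.put(jobId, graph); }
    }

    // Standalone store: nothing is persisted, so nothing can be recovered.
    static class StandaloneJobGraphStore implements JobGraphStore {
        public void put(String jobId, byte[] graph) { /* intentionally a no-op */ }
    }

    static JobGraphStore storeFor(boolean restartStrategyEnabled) {
        return restartStrategyEnabled
                ? new HaJobGraphStore()
                : new StandaloneJobGraphStore();
    }

    public static void main(String[] args) {
        assert storeFor(true) instanceof HaJobGraphStore;
        assert storeFor(false) instanceof StandaloneJobGraphStore;
    }
}
```

The design point is that persistence only pays off when a job can be restarted after a failover; for OLAP jobs it is pure latency overhead.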
[GitHub] [flink] flinkbot commented on pull request #23101: [FLINK-32667][dispatcher] Do not save job without restart strategy to job writer and store
flinkbot commented on PR #23101: URL: https://github.com/apache/flink/pull/23101#issuecomment-1655290894 ## CI report: * 9e88a35aafee8645ebbabb3225b24fa8052b4bcd UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #23100: [FLINK-32708][network] Fix the write logic in remote tier of hybrid shuffle
flinkbot commented on PR #23100: URL: https://github.com/apache/flink/pull/23100#issuecomment-1655282000 ## CI report: * 30bae68215a49365875c956eadde2a36929f0aea UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-32667) Use standalone store and embedding writer for jobs with no-restart-strategy in session cluster
[ https://issues.apache.org/jira/browse/FLINK-32667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-32667: --- Labels: pull-request-available (was: ) > Use standalone store and embedding writer for jobs with no-restart-strategy > in session cluster > -- > > Key: FLINK-32667 > URL: https://issues.apache.org/jira/browse/FLINK-32667 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.18.0 >Reporter: Fang Yong >Priority: Major > Labels: pull-request-available > > When a Flink session cluster uses the ZooKeeper or Kubernetes high-availability service, it stores jobs in ZooKeeper or in a ConfigMap. Flink OLAP jobs submitted to the session cluster always turn off the restart strategy, so these no-restart-strategy jobs should not be stored in ZooKeeper or in a Kubernetes ConfigMap. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] FangYongs opened a new pull request, #23101: [FLINK-32667][dispatcher] Do not save job without restart strategy to job writer and store
FangYongs opened a new pull request, #23101: URL: https://github.com/apache/flink/pull/23101 ## What is the purpose of the change This PR avoids saving jobs that have no restart strategy to the job writer and job graph store. In OLAP scenarios, jobs never have a restart strategy, and saving them to ZooKeeper (or to a ConfigMap on Kubernetes) adds latency; persisting this information is unnecessary for such jobs. ## Brief change log - Add DispatcherJobStorage for Dispatcher to manage job writer and store - Register job to DispatcherJobStorage if it has no restart strategy - Do not save job information if job has no restart strategy - Unregister job in DispatcherJobStorage when the job reaches a terminal state ## Verifying this change This change added tests and can be verified as follows: - Added `testCleanupWhenJobFinishedWithNoRestart` and `testCleanupWhenJobCanceledWithNoRestart` to test register and unregister in Dispatcher ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) no - The serializers: (yes / no / don't know) no - The runtime per-record code paths (performance sensitive): (yes / no / don't know) no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) no - The S3 file system connector: (yes / no / don't know) no ## Documentation - Does this pull request introduce a new feature? (yes / no) no - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
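The register/unregister lifecycle listed in the PR's change log can be sketched like this. The class name mirrors the PR's `DispatcherJobStorage`, but the body is a hedged illustration of the described behavior, not the actual Flink code:

```java
// Sketch of the proposed lifecycle: no-restart jobs are tracked only in
// memory on submission and dropped once they reach a terminal state, so
// they never touch ZooKeeper or a Kubernetes ConfigMap.
import java.util.HashSet;
import java.util.Set;

public class DispatcherJobStorageSketch {
    private final Set<String> noRestartJobs = new HashSet<>();

    /** Returns true if the job is tracked in memory (i.e. not persisted). */
    boolean register(String jobId, boolean hasRestartStrategy) {
        if (hasRestartStrategy) {
            return false; // would go through the normal HA job graph store
        }
        return noRestartJobs.add(jobId); // tracked in memory only
    }

    /** Called when the job finishes, fails terminally, or is canceled. */
    void onTerminalState(String jobId) {
        noRestartJobs.remove(jobId);
    }

    boolean isTracked(String jobId) {
        return noRestartJobs.contains(jobId);
    }

    public static void main(String[] args) {
        DispatcherJobStorageSketch storage = new DispatcherJobStorageSketch();
        storage.register("olap-job", false);
        storage.onTerminalState("olap-job");
        assert !storage.isTracked("olap-job");
    }
}
```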
[jira] [Updated] (FLINK-32708) Fix the write logic in remote tier of hybrid shuffle
[ https://issues.apache.org/jira/browse/FLINK-32708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-32708: --- Labels: pull-request-available (was: ) > Fix the write logic in remote tier of hybrid shuffle > > > Key: FLINK-32708 > URL: https://issues.apache.org/jira/browse/FLINK-32708 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.18.0 >Reporter: Wencong Liu >Priority: Major > Labels: pull-request-available > Fix For: 1.18.0 > > > Currently, on the writer side in the remote tier, the flag file indicating > the latest segment id is updated first, followed by the creation of the data > file. This results in an incorrect order of file creation and we should fix > it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
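The ordering bug described in FLINK-32708 (flag file advanced before the data file exists) and its fix can be illustrated with plain files. File names and the method are illustrative assumptions, not Flink's remote-tier implementation:

```java
// Sketch of the corrected write order: create the segment data file first,
// then update the flag file that advertises the latest segment id. With the
// buggy (reversed) order, a reader could observe the new segment id while
// the corresponding data file does not exist yet.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RemoteTierWriteOrder {
    static void writeSegment(Path dir, int segmentId, byte[] data) throws IOException {
        Path segment = dir.resolve("seg-" + segmentId);
        Path flag = dir.resolve("latest-segment-id");
        Files.write(segment, data);                               // 1. data file first
        Files.write(flag, String.valueOf(segmentId).getBytes());  // 2. then the flag
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tier");
        writeSegment(dir, 3, new byte[]{1, 2});
        // Whenever the flag names a segment, that segment's file already exists.
        int latest = Integer.parseInt(Files.readString(dir.resolve("latest-segment-id")));
        assert Files.exists(dir.resolve("seg-" + latest));
    }
}
```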
[GitHub] [flink] WencongLiu opened a new pull request, #23100: [FLINK-32708][network] Fix the write logic in remote tier of hybrid shuffle
WencongLiu opened a new pull request, #23100: URL: https://github.com/apache/flink/pull/23100 ## What is the purpose of the change *Fix the write logic in remote tier of hybrid shuffle.* ## Brief change log - *Fix the write logic in remote tier of hybrid shuffle.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-32709) Improve memory utilization for Hybrid Shuffle
[ https://issues.apache.org/jira/browse/FLINK-32709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuxin Tan updated FLINK-32709: -- Summary: Improve memory utilization for Hybrid Shuffle (was: Modify segment size to improve memory utilization for Hybrid Shuffle) > Improve memory utilization for Hybrid Shuffle > - > > Key: FLINK-32709 > URL: https://issues.apache.org/jira/browse/FLINK-32709 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.18.0 >Reporter: Yuxin Tan >Assignee: Yuxin Tan >Priority: Major > Labels: pull-request-available > > Currently, each subpartition in Disk/Remote has a segment size of 8M. When > writing segments to the Disk tier with a parallelism of 1000, only shuffle > data exceeding 1000 * 8M can be written to the Memory tier again. However, > for most shuffles, the data volume size falls below this limit, significantly > impacting Memory tier utilization. > For better performance, it is necessary to address this issue to improve the > memory tier utilization. -- This message was sent by Atlassian Jira (v8.20.10#820010)
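The "1000 * 8M" limit from the ticket is simple arithmetic, sketched below as a back-of-the-envelope check (8M is read as 8 MiB; the threshold formula is an assumption based on the ticket's description):

```java
// With an 8 MiB segment per subpartition and parallelism 1000, roughly
// 8000 MiB of shuffle data must accumulate before data can be written to
// the memory tier again -- far more than most shuffles produce.
public class SegmentMath {
    static long thresholdBytes(long segmentBytes, int parallelism) {
        return segmentBytes * parallelism;
    }

    public static void main(String[] args) {
        long segmentBytes = 8L * 1024 * 1024; // 8 MiB per subpartition segment
        long threshold = thresholdBytes(segmentBytes, 1000);
        System.out.println(threshold / (1024 * 1024) + " MiB"); // prints "8000 MiB"
    }
}
```

Shrinking the segment size lowers this threshold proportionally, which is why the ticket frames it as a memory-utilization improvement.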
[GitHub] [flink] LadyForest closed pull request #23065: [FLINK-32656][table] Deprecate ManagedTable related APIs
LadyForest closed pull request #23065: [FLINK-32656][table] Deprecate ManagedTable related APIs URL: https://github.com/apache/flink/pull/23065 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #23099: [FLINK-32583][rest] Fix deadlock in RestClient (backport to 1.16)
flinkbot commented on PR #23099: URL: https://github.com/apache/flink/pull/23099#issuecomment-1655250466 ## CI report: * eb0bbb685ade3539ddf1ad08755b2db9be306ec3 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-connector-hbase] whhe commented on pull request #17: [FLINK-28013] Shade all netty dependencies in sql jar
whhe commented on PR #17: URL: https://github.com/apache/flink-connector-hbase/pull/17#issuecomment-1655247084 > @whhe The connector actually shouldn't rely on `flink-shaded` at all. It will not depend on 'flink-shaded'; this modification adds the netty packages other than 'netty-all' (such as 'netty-transport' here) to the sql jar. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
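Shading individual netty artifacts into a sql jar typically looks like the maven-shade-plugin fragment below. This is an illustrative sketch only: the artifact list and the relocation prefix are assumptions, not the actual contents of the PR.

```xml
<!-- Hypothetical fragment: include the individual netty artifacts
     (not just netty-all) in the sql jar and relocate them so they
     cannot clash with a user's own netty version. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <artifactSet>
      <includes>
        <include>io.netty:netty-all</include>
        <include>io.netty:netty-transport</include>
      </includes>
    </artifactSet>
    <relocations>
      <relocation>
        <pattern>io.netty</pattern>
        <shadedPattern>org.apache.flink.hbase.shaded.io.netty</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```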
[jira] [Comment Edited] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748445#comment-17748445 ] Matthias Pohl edited comment on FLINK-31168 at 7/28/23 8:03 AM: For the record: The reported cases do not always have the same cause. ||Build||Branch||Comment||Description|| |[20230221.2 (#46342)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706]|release-1.15 (c6b649bf)|Jira issue description|Job was not found during initialization phase of the job| |[20221014.20 (#42038)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42038=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=9922]|1.16 PR|1st report in comments|Error after job finished| |[20232504.1 (#48442)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48442=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12170]|master (1.18)|2nd report in comments|Error after job finished| |[20230629.1 (#50607)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=50607=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9363]|master (74ae4b24)|3rd report in comments|Ask timeout| |[20230711.1 (#51165)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9898]|master (4cf2124d)|4th report in comments|Error while restarting the dispatcher (job not finished)| |[20230717.1 (#51299)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=10039]|master (4690dc29)|5th report in comments|Error while restarting the dispatcher (job not finished)| [20230711.1 
(#51165)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9898] is the first failure that included the flink-shaded 17.0 upgrade (FLINK-32032). was (Author: mapohl): For the record: The reported cases do not always have the same cause. ||Build||Branch||Comment||Description|| |[20230221.2 (#46342)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706]|release-1.15 (c6b649bf)|Jira issue description|Job was not found during initialization phase of the job| |[20221014.20 (#42038)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42038=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=9922]|1.16 PR|1st report in comments|Error after job finished| |[20232504.1 (#48442)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48442=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12170]|master (1.18)|2nd report in comments|Error after job finished| |[20230629.1 (#50607)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=50607=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9363]|master (74ae4b24)|3rd report in comments|Ask timeout| |[20230711.1 (#51165)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9898]|master (4cf2124d)|4th report in comments|Error while restarting the dispatcher (job not finished)| |[20230717.1 (#51299)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=10039]|master (4690dc29)|5th report in comments|Error while restarting the dispatcher (job not finished)| > 
JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Major > Labels: test-stability
[jira] [Comment Edited] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748445#comment-17748445 ] Matthias Pohl edited comment on FLINK-31168 at 7/28/23 7:37 AM: For the record: The reported cases do not always have the same cause. ||Build||Branch||Comment||Description|| |[20230221.2 (#46342)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706]|release-1.15 (c6b649bf)|Jira issue description|Job was not found during initialization phase of the job| |[20221014.20 (#42038)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42038=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=9922]|1.16 PR|1st report in comments|Error after job finished| |[20232504.1 (#48442)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48442=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12170]|master (1.18)|2nd report in comments|Error after job finished| |[20230629.1 (#50607)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=50607=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9363]|master (74ae4b24)|3rd report in comments|Ask timeout| |[20230711.1 (#51165)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9898]|master (4cf2124d)|4th report in comments|Error while restarting the dispatcher (job not finished)| |[20230717.1 (#51299)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=10039]|master (4690dc29)|5th report in comments|Error while restarting the dispatcher (job not finished)| was (Author: mapohl): For the record: The reported cases do not always have the same 
cause. ||Build||Branch||Comment||Description|| |[20230221.2|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706]|release-1.15 (c6b649bf)|Jira issue description|Job was not found during initialization phase of the job| |[20221014.20|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42038=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=9922]|1.16 PR|1st report in comments|Error after job finished| |[20232504.1|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48442=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12170]|master (1.18)|2nd report in comments|Error after job finished| |[20230629.1|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=50607=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9363]|master (74ae4b24)|3rd report in comments|Ask timeout| |[20230711.1|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9898]|master (4cf2124d)|4th report in comments|Error while restarting the dispatcher (job not finished)| |[20230717.1|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=10039]|master (4690dc29)|5th report in comments|Error while restarting the dispatcher (job not finished)| > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Major > Labels: test-stability > > 
[jira] [Commented] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748445#comment-17748445 ] Matthias Pohl commented on FLINK-31168: --- For the record: The reported cases do not always have the same cause.

||Build||Branch||Comment||Description||
|[20230221.2|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706]|release-1.15 (c6b649bf)|Jira issue description|Job was not found during initialization phase of the job|
|[20221014.20|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=42038=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=9922]|1.16 PR|1st report in comments|Error after job finished|
|[20232504.1|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48442=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12170]|master (1.18)|2nd report in comments|Error after job finished|
|[20230629.1|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=50607=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9363]|master (74ae4b24)|3rd report in comments|Ask timeout|
|[20230711.1|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51165=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=9898]|master (4cf2124d)|4th report in comments|Error while restarting the dispatcher (job not finished)|
|[20230717.1|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51299=logs=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3=712ade8c-ca16-5b76-3acd-14df33bc1cb1=10039]|master (4690dc29)|5th report in comments|Error while restarting the dispatcher (job not finished)|

> JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > 
Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056) > ... 
4 more > Caused by: java.lang.RuntimeException: Error while waiting for job to be > initialized > at > org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160) > at > org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82) > at > org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73) > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) > at > java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at >
[GitHub] [flink-connector-pulsar] tisonkun commented on pull request #56: [FLINK-26203] Basic table factory for Pulsar connector
tisonkun commented on PR #56: URL: https://github.com/apache/flink-connector-pulsar/pull/56#issuecomment-1655131338

Support keybytes for now to minimize code change. Following work:
1. Add SQL client E2E test
2. Add upsert table factory
3. Revisit key serialization support. I know that users can have biz data in keys, but I don't know if it's common.

@leonardBang @PatrickRen does Kafka support mapping message key to Flink SQL columns? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748254#comment-17748254 ] Matthias Pohl edited comment on FLINK-31168 at 7/28/23 6:46 AM:

The error appears in [FileExecutionGraphInfoStore#getAvailableJobDetails(JobID)|https://github.com/apache/flink/blob/d78d52b27af2550f50b44349d3ec6dc84b966a8a/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/FileExecutionGraphInfoStore.java#L237] where the JobDetails are not cached. The {{FileExecutionGraphInfoStore}} uses two caches:

* {{jobDetailsCache}} holds the {{JobDetails}} of each job. It has a default expiration time of 3600s (see [jobstore.expiration-time default|https://github.com/apache/flink/blob/f3598c50c0d3dcdf8058b01f13b7eb9fc5954f7c/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java#L340]). The size of the cache is limited to {{Integer.MAX_VALUE}} entries (see [jobstore.max-capacity default|https://github.com/apache/flink/blob/f3598c50c0d3dcdf8058b01f13b7eb9fc5954f7c/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java#L350]). Removing entries from this cache triggers the removal of the file and invalidates the corresponding entries in both caches. Removal can happen even earlier, though.
* {{executionGraphInfoCache}} holds the {{ExecutionGraphInfo}}. It's limited to 50MB (see [jobstore.cache-size default|https://github.com/apache/flink/blob/f3598c50c0d3dcdf8058b01f13b7eb9fc5954f7c/flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java#L331]). The {{ExecutionGraphInfo}} will be reloaded from disk in case it was removed from the cache before.

This error can be simulated by reducing either the expiry time of the {{jobDetailsCache}} or decreasing the entry limit of that cache. One suspicion is that the entry gets removed earlier. 
The [CacheBuilder#maximumSize JavaDoc|https://guava.dev/releases/19.0/api/docs/com/google/common/cache/CacheBuilder.html#maximumSize(long)] states that this can happen: {quote} Note that the cache may evict an entry before this limit is exceeded. {quote} We should investigate whether the error started to appear with the {{flink-shaded}} update (FLINK-30772, FLINK-32032) where the update might have included a change in how the cache operates. I will continue investigating this tomorrow. > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink
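The failure mode described in the comment above — a size- or time-bounded cache silently dropping an entry, which then surfaces as "job not found" — can be illustrated with a minimal sketch. This is a hypothetical stdlib illustration, not Flink's actual {{FileExecutionGraphInfoStore}} (which uses a Guava cache with {{expireAfterWrite}} and {{maximumSize}}); the class and method names below are invented for the example:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of size-bounded eviction, similar in spirit to
// the jobDetailsCache discussed in this ticket. Once the bound is reached,
// the least recently used entry is silently dropped, and a later lookup
// returns null -- the analogue of the "Job ... not found" REST error.
public class BoundedJobDetailsCache {
    private final Map<String, String> cache;

    public BoundedJobDetailsCache(int maxEntries) {
        // accessOrder=true gives LRU semantics; removeEldestEntry enforces the bound.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public void put(String jobId, String details) {
        cache.put(jobId, details);
    }

    // Returns null when the entry was already evicted.
    public String get(String jobId) {
        return cache.get(jobId);
    }

    public static void main(String[] args) {
        BoundedJobDetailsCache c = new BoundedJobDetailsCache(2);
        c.put("job-1", "FINISHED");
        c.put("job-2", "FINISHED");
        c.put("job-3", "FINISHED"); // evicts job-1, the least recently used entry
        System.out.println(c.get("job-1")); // null: "job not found"
        System.out.println(c.get("job-3")); // FINISHED
    }
}
```

In the real store the situation is subtler, because (per the Guava JavaDoc quoted above) eviction may happen even before the limit is reached, so a lookup can fail although the cache never appeared full.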
[jira] [Commented] (FLINK-31168) JobManagerHAProcessFailureRecoveryITCase failed due to job not being found
[ https://issues.apache.org/jira/browse/FLINK-31168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748436#comment-17748436 ] Matthias Pohl commented on FLINK-31168: --- My suspicion doesn't seem to hold: I couldn't identify any change between guava 30.1 and guava 31.1 (I checked the sources and the release notes) that would have been an explanation for different eviction behavior. Additionally, the [CacheBuilder#maximumSize JavaDoc|https://guava.dev/releases/19.0/api/docs/com/google/common/cache/CacheBuilder.html#maximumSize(long)] states that entries are only evicted if the cache is getting close to its limit (which is {{Integer.MAX_VALUE}}). The test itself only runs a single job which would lead to one entry in the cache. > JobManagerHAProcessFailureRecoveryITCase failed due to job not being found > -- > > Key: FLINK-31168 > URL: https://issues.apache.org/jira/browse/FLINK-31168 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.3, 1.16.1, 1.18.0 >Reporter: Matthias Pohl >Assignee: Matthias Pohl >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46342=logs=b0a398c0-685b-599c-eb57-c8c2a771138e=747432ad-a576-5911-1e2a-68c6bedc248a=12706 > We see this build failure because a job couldn't be found: > {code} > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:319) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1061) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:958) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:942) > at > 
org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase.testJobManagerFailure(JobManagerHAProcessFailureRecoveryITCase.java:235) > at > org.apache.flink.test.recovery.JobManagerHAProcessFailureRecoveryITCase$4.run(JobManagerHAProcessFailureRecoveryITCase.java:336) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error while waiting for job to be initialized > at > java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) > at > org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1056) > ... 4 more > Caused by: java.lang.RuntimeException: Error while waiting for job to be > initialized > at > org.apache.flink.client.ClientUtils.waitUntilJobInitializationFinished(ClientUtils.java:160) > at > org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.lambda$execute$2(AbstractSessionClusterExecutor.java:82) > at > org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:73) > at > java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) > at > java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479) > at > java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) > at > java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020) > at > java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656) > at > java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594) > at > java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183) > Caused by: java.util.concurrent.ExecutionException: > org.apache.flink.runtime.rest.util.RestClientException: > [org.apache.flink.runtime.rest.NotFoundException: Job > 865dcd87f4828dbeb3d93eb52e2636b1 not found > at > 
org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$1(AbstractExecutionGraphHandler.java:99) > at > java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986) > at > java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970) > at > java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) > at > java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) > at > org.apache.flink.runtime.rest.handler.legacy.DefaultExecutionGraphCache.lambda$getExecutionGraphInternal$0(DefaultExecutionGraphCache.java:109)
[jira] (FLINK-26541) SQL Client should support submitting SQL jobs in application mode
[ https://issues.apache.org/jira/browse/FLINK-26541 ] shizhengchao deleted comment on FLINK-26541: -- was (Author: tinny): Why not use session mode? If application mode supports SQL, then I think there is no difference from session mode. > SQL Client should support submitting SQL jobs in application mode > - > > Key: FLINK-26541 > URL: https://issues.apache.org/jira/browse/FLINK-26541 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes, Deployment / YARN, Table SQL / > Client >Reporter: Jark Wu >Priority: Major > > Currently, the SQL Client only supports submitting jobs in session mode and > per-job mode. As the community is going to drop the per-job mode (FLINK-26000), > SQL Client should support application mode as well. Otherwise, SQL Client can > only submit SQL in session mode, but streaming jobs should be submitted > in per-job or application mode to have better resource isolation. > Discussions: https://lists.apache.org/thread/2yq351nb721x23rz1q8qlyf2tqrk147r -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [flink] hanyuzheng7 closed pull request #23098: [FLINK-32262] Add MAP_ENTRIES support
hanyuzheng7 closed pull request #23098: [FLINK-32262] Add MAP_ENTRIES support URL: https://github.com/apache/flink/pull/23098
[GitHub] [flink] flinkbot commented on pull request #23098: [FLINK-32262] Add MAP_ENTRIES support
flinkbot commented on PR #23098: URL: https://github.com/apache/flink/pull/23098#issuecomment-1655074207

## CI report:

* 2a0667be43d20203ef1c1b05119c87c1efbd9030 UNKNOWN

Bot commands: The @flinkbot bot supports the following commands:
- `@flinkbot run azure` re-run the last Azure build