[jira] [Commented] (FLINK-31974) JobManager crashes after KubernetesClientException exception with FatalExitExceptionHandler
[ https://issues.apache.org/jira/browse/FLINK-31974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719525#comment-17719525 ] Sergio Sainz commented on FLINK-31974: --

Hi [~mapohl] - let me set up a new cluster later on to get the full logs. Below please find the thread dump from the Flink 1.17.0 crash:

{code:java}
2023-04-28 20:50:50,305 INFO  org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Received resource requirements from job 0a97c80a173b7ebb619c5b030b607520: [ResourceRequirement{resourceProfile=ResourceProfile{UNKNOWN}, numberOfRequiredSlots=1}]
...
2023-04-28 20:50:50,534 ERROR org.apache.flink.util.FatalExitExceptionHandler [] - FATAL: Thread 'flink-akka.actor.default-dispatcher-15' produced an uncaught exception. Stopping the process...
java.util.concurrent.CompletionException: org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1/api/v1/namespaces/env-my-namespace/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-2" is forbidden: exceeded quota: my-namespace-realtime-server-resource-quota, requested: limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13.
	at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) ~[?:?]
	at java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source) ~[?:?]
	at java.util.concurrent.CompletableFuture$AsyncRun.run(Unknown Source) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
	at java.lang.Thread.run(Unknown Source) ~[?:?]
Caused by: org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1/api/v1/namespaces/env-my-namespace/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-2" is forbidden: exceeded quota: my-namespace-realtime-server-resource-quota, requested: limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13.
	at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:684) ~[flink-dist-1.17.0.jar:1.17.0]
	at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:664) ~[flink-dist-1.17.0.jar:1.17.0]
	at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:613) ~[flink-dist-1.17.0.jar:1.17.0]
	at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:558) ~[flink-dist-1.17.0.jar:1.17.0]
	at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) ~[flink-dist-1.17.0.jar:1.17.0]
	at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:308) ~[flink-dist-1.17.0.jar:1.17.0]
	at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:644) ~[flink-dist-1.17.0.jar:1.17.0]
	at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:83) ~[flink-dist-1.17.0.jar:1.17.0]
	at org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61) ~[flink-dist-1.17.0.jar:1.17.0]
	at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.lambda$createTaskManagerPod$1(Fabric8FlinkKubeClient.java:163) ~[flink-dist-1.17.0.jar:1.17.0]
	... 4 more
2023-04-28 20:50:50,602 ERROR org.apache.flink.util.FatalExitExceptionHandler [] - Thread dump:
"main" prio=5 Id=1 WAITING on java.util.concurrent.CompletableFuture$Signaller@2897b146
	at java.base@11.0.19/jdk.internal.misc.Unsafe.park(Native Method)
	- waiting on java.util.concurrent.CompletableFuture$Signaller@2897b146
	at java.base@11.0.19/java.util.concurrent.locks.LockSupport.park(Unknown Source)
	at java.base@11.0.19/java.util.concurrent.CompletableFuture$Signaller.block(Unknown Source)
	at java.base@11.0.19/java.util.concurrent.ForkJoinPool.managedBlock(Unknown Source)
	at java.base@11.0.19/java.util.concurrent.CompletableFuture.waitingGet(Unknown Source)
	at
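The 1.16.1 behaviour described in this ticket suggests the failure can be contained by handling it on the returned future rather than letting it escape to the dispatcher thread, where the uncaught-exception handler kills the process. A minimal, self-contained sketch of that idea (the `createTaskManagerPod` stub below is hypothetical and merely stands in for the failing Kubernetes POST; it is not Flink's actual API):

```java
import java.util.concurrent.CompletableFuture;

public class GracefulPodCreation {

    // Hypothetical stand-in for the Kubernetes client call that fails
    // when the namespace resource quota is exceeded.
    static CompletableFuture<String> createTaskManagerPod() {
        return CompletableFuture.supplyAsync(() -> {
            throw new RuntimeException("Forbidden! exceeded quota");
        });
    }

    public static void main(String[] args) {
        // Handling the failure on the future itself keeps it away from the
        // thread's uncaught-exception handler (the graceful 1.16.1 behaviour).
        // The async failure arrives wrapped in a CompletionException, so the
        // original message is on getCause().
        String result = createTaskManagerPod()
                .exceptionally(t -> "request failed: " + t.getCause().getMessage())
                .join();
        System.out.println(result);
    }
}
```

Usage: running the class prints the handled failure message instead of terminating the JVM, which is the difference between the two log excerpts in this ticket.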
[jira] [Created] (FLINK-31978) flink-connector-jdbc v.1.17.0 not published
Sergio Sainz created FLINK-31978: Summary: flink-connector-jdbc v.1.17.0 not published Key: FLINK-31978 URL: https://issues.apache.org/jira/browse/FLINK-31978 Project: Flink Issue Type: Bug Components: Connectors / JDBC Affects Versions: 1.17.0 Reporter: Sergio Sainz The connector flink-connector-jdbc version 1.17.0 is not being published in public maven repositories: [https://mvnrepository.com/artifact/org.apache.flink/flink-connector-jdbc] Maybe related to FLINK-29642? -- This message was sent by Atlassian Jira (v8.20.10#820010)
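For context, FLINK-29642 tracked moving the JDBC connector out of the main Flink repository; externalized connectors are versioned independently of the core release, so the artifact is not expected to appear with a plain 1.17.0 version. A dependency declaration would look roughly like the sketch below (the 3.1.0-1.17 version shown is an assumption about the eventual externalized release, not a confirmed artifact):

```xml
<!-- Hypothetical coordinates for the externalized JDBC connector;
     check Maven Central for the actually published version. -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-jdbc</artifactId>
    <version>3.1.0-1.17</version>
</dependency>
```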
[jira] [Created] (FLINK-31974) JobManager crashes after KubernetesClientException exception with FatalExitExceptionHandler
Sergio Sainz created FLINK-31974: Summary: JobManager crashes after KubernetesClientException exception with FatalExitExceptionHandler Key: FLINK-31974 URL: https://issues.apache.org/jira/browse/FLINK-31974 Project: Flink Issue Type: Bug Components: Deployment / Kubernetes Affects Versions: 1.17.0 Reporter: Sergio Sainz

When the resource quota limit is reached, the JobManager throws:

org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1/api/v1/namespaces/my-namespace/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-2" is forbidden: exceeded quota: my-namespace-resource-quota, requested: limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13.

In {*}1.16.1, this is handled gracefully{*}:

2023-04-28 22:07:24,631 WARN org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Failed requesting worker with resource spec WorkerResourceSpec \{cpuCores=1.0, taskHeapSize=25.600mb (26843542 bytes), taskOffHeapSize=0 bytes, networkMemSize=64.000mb (67108864 bytes), managedMemSize=230.400mb (241591914 bytes), numSlots=4}, current pending count: 0
java.util.concurrent.CompletionException: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1/api/v1/namespaces/my-namespace/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-138" is forbidden: exceeded quota: my-namespace-resource-quota, requested: limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13.
	at java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source) ~[?:?]
	at java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source) ~[?:?]
	at java.util.concurrent.CompletableFuture$AsyncRun.run(Unknown Source) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
	at java.lang.Thread.run(Unknown Source) ~[?:?]
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1/api/v1/namespaces/my-namespace/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-138" is forbidden: exceeded quota: my-namespace-resource-quota, requested: limits.cpu=3, used: limits.cpu=12100m, limited: limits.cpu=13.
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:684) ~[flink-dist-1.16.1.jar:1.16.1]
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:664) ~[flink-dist-1.16.1.jar:1.16.1]
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:613) ~[flink-dist-1.16.1.jar:1.16.1]
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:558) ~[flink-dist-1.16.1.jar:1.16.1]
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:521) ~[flink-dist-1.16.1.jar:1.16.1]
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:308) ~[flink-dist-1.16.1.jar:1.16.1]
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:644) ~[flink-dist-1.16.1.jar:1.16.1]
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:83) ~[flink-dist-1.16.1.jar:1.16.1]
	at io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61) ~[flink-dist-1.16.1.jar:1.16.1]
	at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.lambda$createTaskManagerPod$1(Fabric8FlinkKubeClient.java:163) ~[flink-dist-1.16.1.jar:1.16.1]
	... 4 more

But {*}in Flink 1.17.0, the JobManager crashes{*}:

2023-04-28 20:50:50,534 ERROR org.apache.flink.util.FatalExitExceptionHandler [] - FATAL: Thread 'flink-akka.actor.default-dispatcher-15' produced an uncaught exception. Stopping the process...
java.util.concurrent.CompletionException: org.apache.flink.kubernetes.shaded.io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.96.0.1/api/v1/namespaces/my-namespace/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "my-namespace-flink-cluster-taskmanager-1-2" is forbidden: exceeded quota:
[jira] [Updated] (FLINK-31796) Support service mesh istio with Flink kubernetes (both native and operator) for secure communications
[ https://issues.apache.org/jira/browse/FLINK-31796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Sainz updated FLINK-31796: -

Description:
Currently Flink Native Kubernetes does not support istio + TLS cleanly: Flink assumes that pods will be able to communicate by ip-address, while istio + TLS does not allow routing by a pod's ip-address. This ticket is to track the effort to support Flink with istio. One workaround is to disable the istio sidecar container, but this is not secure [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html]. Akka allows securing the channel manually, but there is no documentation on how to do this in the context of Flink. One potential solution is to document how to configure an akka cluster + TLS in Flink.

For example, when using the native kubernetes deployment mode with high-availability (HA), and a new TaskManager pod is started to process a job, the TaskManager pod will attempt to register itself with the resource manager (JobManager). The TaskManager looks up the resource manager by ip-address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Another affected feature is metric collection.

Please see FLINK-31775 and FLINK-28171. Especially comment "{_}Flink currently just doesn't support Istio.{_}"

was: Currently Flink Native Kubernetes does not support istio + TLS cleanly : Flink assumes that pods will be able to communicate by ip-address, meanwhile istio + TLS does not allow routing by pod's ip-address. This ticket is to track effort to support Flink with istio. Some workaround is to disable the istio sidecar container, but this is not secure [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html]. Akka allows to secure the channel manually, but there is no documentation how to do this in the context of Flink. One potential solution for this is to have the documentation about how to configure akka cluster + TLS in flink.
For example when using native kubernetes deployment mode with high-availability (HA), and when new TaskManager pod is started to process a job, the TaskManager pod will attempt to register itself to the resource manager (JobManager). the TaskManager looks up the resource manager per ip-address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Other affected features is metric collection. Please see FLINK-31775 and FLINK-28171. Especially comment "{_}Flink currently just doesn't support Istio. If anything, it's a new feature/improvement.{_} " > Support service mesh istio with Flink kubernetes (both native and operator) > for secure communications > - > > Key: FLINK-31796 > URL: https://issues.apache.org/jira/browse/FLINK-31796 > Project: Flink > Issue Type: New Feature > Components: Deployment / Kubernetes >Affects Versions: 1.17.0 >Reporter: Sergio Sainz >Priority: Major > > Currently Flink Native Kubernetes does not support istio + TLS cleanly : > Flink assumes that pods will be able to communicate by ip-address, meanwhile > istio + TLS does not allow routing by pod's ip-address. > This ticket is to track effort to support Flink with istio. Some workaround > is to disable the istio sidecar container, but this is not secure > [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html]. Akka > allows to secure the channel manually, but there is no documentation how to > do this in the context of Flink. One potential solution for this is to have > the documentation about how to configure akka cluster + TLS in flink. > For example when using native kubernetes deployment mode with > high-availability (HA), and when new TaskManager pod is started to process a > job, the TaskManager pod will attempt to register itself to the resource > manager (JobManager). the TaskManager looks up the resource manager per > ip-address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). > Other affected features is metric collection. 
> Please see FLINK-31775 and FLINK-28171. Especially comment "{_}Flink > currently just doesn't support Istio.{_}" -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31796) Support service mesh istio with Flink kubernetes (both native and operator) for secure communications
Sergio Sainz created FLINK-31796: Summary: Support service mesh istio with Flink kubernetes (both native and operator) for secure communications Key: FLINK-31796 URL: https://issues.apache.org/jira/browse/FLINK-31796 Project: Flink Issue Type: New Feature Components: Deployment / Kubernetes Affects Versions: 1.17.0 Reporter: Sergio Sainz

Currently Flink Native Kubernetes does not support istio + TLS cleanly: Flink assumes that pods will be able to communicate by ip-address, while istio + TLS does not allow routing by a pod's ip-address. This ticket is to track the effort to support Flink with istio. One workaround is to disable the istio sidecar container, but this is not secure [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html]. Akka allows securing the channel manually, but there is no documentation on how to do this in the context of Flink. One potential solution is to document how to configure an akka cluster + TLS in Flink.

For example, when using the native kubernetes deployment mode with high-availability (HA), and a new TaskManager pod is started to process a job, the TaskManager pod will attempt to register itself with the resource manager (JobManager). The TaskManager looks up the resource manager by ip-address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Another affected feature is metric collection.

Please see FLINK-31775 and FLINK-28171. Especially comment "{_}Flink currently just doesn't support Istio. If anything, it's a new feature/improvement.{_}" -- This message was sent by Atlassian Jira (v8.20.10#820010)
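One concrete direction, based on the Istio protocol-selection guidelines referenced in FLINK-28171, is to give the JobManager Service ports explicit protocol hints so the mesh does not have to sniff the non-HTTP akka traffic. The manifest below is an illustrative sketch only; the service name, selector label, and port set are assumptions, not what Flink's native mode actually generates:

```yaml
# Illustrative sketch: explicit protocol hints on a JobManager Service.
apiVersion: v1
kind: Service
metadata:
  name: flink-jobmanager         # assumed name; Flink generates its own
spec:
  clusterIP: None
  selector:
    app: flink                   # assumed label
  ports:
    - name: tcp-jobmanager-rpc   # "tcp-" prefix follows Istio's naming convention
      port: 6123
      protocol: TCP
      appProtocol: tcp           # explicit alternative, Kubernetes >= 1.20
    - name: tcp-blobserver
      port: 6124
      protocol: TCP
      appProtocol: tcp
```

Note this only helps traffic that actually flows through a Service; as discussed on FLINK-31775, the HA registration path uses pod ip-addresses directly.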
[jira] [Comment Edited] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled
[ https://issues.apache.org/jira/browse/FLINK-31775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17711440#comment-17711440 ] Sergio Sainz edited comment on FLINK-31775 at 4/12/23 3:18 PM: ---

Another confusion about how the fix for FLINK-28171 can help this case is because *appProtocol* applies to services, while this defect is about ip-addresses.

Could not resolve ResourceManager address akka.tcp://flink@{*}192.168.140.164:6123{*}/user/rpc/resourcemanager_1,

So we are not sure adding appProtocol could solve the ip-address routing (as there is no service involved)

was (Author: sergiosp): Another confusion about how the fix for FLINK-28171 can help this case is becase *appProtocol* affects services. Meanwhile this defect is about ip-addresses. Could not resolve ResourceManager address akka.tcp://flink@{*}192.168.140.164:6123{*}/user/rpc/resourcemanager_1, Then, we are not sure adding appProtocol could solve the ip-address routing (as there is no service involved)

> High-Availability not supported in kubernetes when istio enabled > > > Key: FLINK-31775 > URL: https://issues.apache.org/jira/browse/FLINK-31775 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.16.1 >Reporter: Sergio Sainz >Priority: Major > > When using native kubernetes deployment mode with high-availability (HA), and > when new TaskManager pod is started to process a job, the TaskManager pod > will attempt to register itself to the resource manager (JobManager). 
the > TaskManager looks up the resource manager per ip-address > (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1) > > Nevertheless when istio is enabled, the resolution by ip address is blocked, > and hence we see that the job cannot start because task manager cannot > register with the resource manager: > 2023-04-10 23:24:19,752 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not > resolve ResourceManager address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in > 1 ms: Could not connect to rpc endpoint under address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. > > Notice that when HA is disabled, the resolution of the resource manager is > made by service name and so the resource manager can be found > > 2023-04-11 00:49:34,162 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful > registration at resource manager > akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* > under registration id 83ad942597f86aa880ee96f1c2b8b923. > > Notice in my case , it is not possible to disable istio as explained here: > [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] > > Although similar to https://issues.apache.org/jira/browse/FLINK-28171 , > logging as separate defect as I believe the fix of FLINK-28171 won't fix this > case. FLINK-28171 is about Flink Kubernetes Operator and this is about > native kubernetes deployment. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
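Since the pod-to-pod akka traffic described above bypasses any Service, another avenue worth exploring is Istio's port-level peer authentication, which can relax mTLS on just the RPC port instead of disabling the sidecar entirely. The sketch below is unverified against a Flink HA deployment, and the namespace and selector label are hypothetical:

```yaml
# Illustrative sketch: keep mTLS strict overall, but make the Flink
# RPC port permissive so direct pod-ip connections are still accepted.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: flink-rpc-permissive
  namespace: my-namespace        # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: flink                 # hypothetical label on the Flink pods
  mtls:
    mode: STRICT
  portLevelMtls:
    6123:                        # JobManager RPC port from the logs above
      mode: PERMISSIVE
```

The trade-off is that traffic on the permissive port is no longer required to be encrypted by the mesh, so this is a narrowing of the "skip the sidecar" workaround rather than a full solution.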
[jira] [Commented] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled
[ https://issues.apache.org/jira/browse/FLINK-31775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17711440#comment-17711440 ] Sergio Sainz commented on FLINK-31775: --

Another confusion about how the fix for FLINK-28171 can help this case is because *appProtocol* affects services, while this defect is about ip-addresses.

Could not resolve ResourceManager address akka.tcp://flink@{*}192.168.140.164:6123{*}/user/rpc/resourcemanager_1,

So we are not sure adding appProtocol could solve the ip-address routing (as there is no service involved)

> High-Availability not supported in kubernetes when istio enabled > > > Key: FLINK-31775 > URL: https://issues.apache.org/jira/browse/FLINK-31775 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.16.1 >Reporter: Sergio Sainz >Priority: Major > > When using native kubernetes deployment mode with high-availability (HA), and > when new TaskManager pod is started to process a job, the TaskManager pod > will attempt to register itself to the resource manager (JobManager). the > TaskManager looks up the resource manager per ip-address > (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1) > > Nevertheless when istio is enabled, the resolution by ip address is blocked, > and hence we see that the job cannot start because task manager cannot > register with the resource manager: > 2023-04-10 23:24:19,752 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not > resolve ResourceManager address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in > 1 ms: Could not connect to rpc endpoint under address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. 
> > Notice that when HA is disabled, the resolution of the resource manager is > made by service name and so the resource manager can be found > > 2023-04-11 00:49:34,162 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful > registration at resource manager > akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* > under registration id 83ad942597f86aa880ee96f1c2b8b923. > > Notice in my case , it is not possible to disable istio as explained here: > [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] > > Although similar to https://issues.apache.org/jira/browse/FLINK-28171 , > logging as separate defect as I believe the fix of FLINK-28171 won't fix this > case. FLINK-28171 is about Flink Kubernetes Operator and this is about > native kubernetes deployment. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled
[ https://issues.apache.org/jira/browse/FLINK-31775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17711413#comment-17711413 ] Sergio Sainz edited comment on FLINK-31775 at 4/12/23 2:32 PM: --- Hi [~huwh] , [~martijnvisser] Thanks for the help triaging ! One question on the solution of adding "appProtocol" into the podTemplate from FLINK-28171. Even after we set the appProtocol, do we need to bypass the istio sidecar? Because we could not bypass istio sidecar in our case ~ was (Author: sergiosp): Hi [~huwh] , [~martijnvisser] Thanks for the help triaging ! One question on the solution of adding "appProtocol" into the podTemplate from FLINK-28171. Even after we set the appProtocol, do we need to bypass the istio sidecar? > High-Availability not supported in kubernetes when istio enabled > > > Key: FLINK-31775 > URL: https://issues.apache.org/jira/browse/FLINK-31775 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.16.1 >Reporter: Sergio Sainz >Priority: Major > > When using native kubernetes deployment mode with high-availability (HA), and > when new TaskManager pod is started to process a job, the TaskManager pod > will attempt to register itself to the resource manager (JobManager). the > TaskManager looks up the resource manager per ip-address > (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1) > > Nevertheless when istio is enabled, the resolution by ip address is blocked, > and hence we see that the job cannot start because task manager cannot > register with the resource manager: > 2023-04-10 23:24:19,752 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not > resolve ResourceManager address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in > 1 ms: Could not connect to rpc endpoint under address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. 
> > Notice that when HA is disabled, the resolution of the resource manager is > made by service name and so the resource manager can be found > > 2023-04-11 00:49:34,162 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful > registration at resource manager > akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* > under registration id 83ad942597f86aa880ee96f1c2b8b923. > > Notice in my case , it is not possible to disable istio as explained here: > [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] > > Although similar to https://issues.apache.org/jira/browse/FLINK-28171 , > logging as separate defect as I believe the fix of FLINK-28171 won't fix this > case. FLINK-28171 is about Flink Kubernetes Operator and this is about > native kubernetes deployment. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled
[ https://issues.apache.org/jira/browse/FLINK-31775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17711413#comment-17711413 ] Sergio Sainz edited comment on FLINK-31775 at 4/12/23 2:13 PM: --- Hi [~huwh] , [~martijnvisser] Thanks for the help triaging ! One question on the solution of adding "appProtocol" into the podTemplate from FLINK-28171. Even after we set the appProtocol, do we need to bypass the istio sidecar? was (Author: sergiosp): Hi [~huwh] , [~martijnvisser] Thanks for the help triaging ! One question on the solution of adding "appProtocol" into the podTemplate from FLINK-28171. Even after we set the appProtocol, do we need to bypass the istio sidecar? > High-Availability not supported in kubernetes when istio enabled > > > Key: FLINK-31775 > URL: https://issues.apache.org/jira/browse/FLINK-31775 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.16.1 >Reporter: Sergio Sainz >Priority: Major > > When using native kubernetes deployment mode with high-availability (HA), and > when new TaskManager pod is started to process a job, the TaskManager pod > will attempt to register itself to the resource manager (JobManager). the > TaskManager looks up the resource manager per ip-address > (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1) > > Nevertheless when istio is enabled, the resolution by ip address is blocked, > and hence we see that the job cannot start because task manager cannot > register with the resource manager: > 2023-04-10 23:24:19,752 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not > resolve ResourceManager address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in > 1 ms: Could not connect to rpc endpoint under address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. 
> > Notice that when HA is disabled, the resolution of the resource manager is > made by service name and so the resource manager can be found > > 2023-04-11 00:49:34,162 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful > registration at resource manager > akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* > under registration id 83ad942597f86aa880ee96f1c2b8b923. > > Notice in my case , it is not possible to disable istio as explained here: > [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] > > Although similar to https://issues.apache.org/jira/browse/FLINK-28171 , > logging as separate defect as I believe the fix of FLINK-28171 won't fix this > case. FLINK-28171 is about Flink Kubernetes Operator and this is about > native kubernetes deployment. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled
[ https://issues.apache.org/jira/browse/FLINK-31775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17711413#comment-17711413 ] Sergio Sainz commented on FLINK-31775: -- Hi [~huwh] , [~martijnvisser] Thanks for the help triaging ! One question on the solution of adding "appProtocol" into the podTemplate from FLINK-28171. Even after we set the appProtocol, do we need to bypass the istio sidecar? > High-Availability not supported in kubernetes when istio enabled > > > Key: FLINK-31775 > URL: https://issues.apache.org/jira/browse/FLINK-31775 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.16.1 >Reporter: Sergio Sainz >Priority: Major > > When using native kubernetes deployment mode with high-availability (HA), and > when new TaskManager pod is started to process a job, the TaskManager pod > will attempt to register itself to the resource manager (JobManager). the > TaskManager looks up the resource manager per ip-address > (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1) > > Nevertheless when istio is enabled, the resolution by ip address is blocked, > and hence we see that the job cannot start because task manager cannot > register with the resource manager: > 2023-04-10 23:24:19,752 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not > resolve ResourceManager address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in > 1 ms: Could not connect to rpc endpoint under address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. 
> > Notice that when HA is disabled, the resolution of the resource manager is > made by service name and so the resource manager can be found > > 2023-04-11 00:49:34,162 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful > registration at resource manager > akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* > under registration id 83ad942597f86aa880ee96f1c2b8b923. > > Notice in my case , it is not possible to disable istio as explained here: > [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] > > Although similar to https://issues.apache.org/jira/browse/FLINK-28171 , > logging as separate defect as I believe the fix of FLINK-28171 won't fix this > case. FLINK-28171 is about Flink Kubernetes Operator and this is about > native kubernetes deployment. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-28171) Adjust Job and Task manager port definitions to work with Istio+mTLS
[ https://issues.apache.org/jira/browse/FLINK-28171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17710375#comment-17710375 ] Sergio Sainz edited comment on FLINK-28171 at 4/11/23 11:46 PM:

Hello [~wangyang0918]! I am also experiencing this issue with a native kubernetes deployment with HA enabled (not in the flink k8s operator). In [https://lists.apache.org/thread/yl40s9069wksz66qlf9t6jhmwsn59zft] you mentioned that "If HA enabled, the internal jobmanager rpc service will not be created. Instead, the TaskManager retrieves the JobManager address via HA services and connects to it via pod ip."

Do you know whether we can change the way the TaskManager connects to HA services: do not use the ip address and instead use the pod name? I think we cannot add the akka workaround of bypassing the istio sidecar. Alternatively, do you know how to add TLS encryption to this channel (TaskManager->HA service) manually?

Thanks for the info ~

was (Author: sergiosp): Hello [~wangyang0918] ! also experiencing this issue with native kubernetes deployment with HA enabled (not in flink k8s operator). In [https://lists.apache.org/thread/yl40s9069wksz66qlf9t6jhmwsn59zft] you mentioned that "If HA enabled, the internal jobmanager rpc service will not be created. Instead, the TaskManager retrieves the JobManager address via HA services and connects to it via pod ip." Do you know whether we can change the way TaskManager connects to HA services: do not use ip address and instead use pod name? I think we cannot add the akka workaround of bypassing the istio sidecar. 
Thanks for the info ~ > Adjust Job and Task manager port definitions to work with Istio+mTLS > > > Key: FLINK-28171 > URL: https://issues.apache.org/jira/browse/FLINK-28171 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.14.4 > Environment: flink-kubernetes-operator 1.0.0 > Flink 1.14-java11 > Kubernetes v1.19.5 > Istio 1.7.6 >Reporter: Moshe Elisha >Assignee: Moshe Elisha >Priority: Major > Labels: pull-request-available > > Hello, > > We are launching Flink deployments using the [Flink Kubernetes > Operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-stable/] > on a Kubernetes cluster with Istio and mTLS enabled. > > We found that the TaskManager is unable to communicate with the JobManager on > the jobmanager-rpc port: > > {{2022-06-15 15:25:40,508 WARN akka.remote.ReliableDeliverySupervisor > [] - Association with remote system > [akka.tcp://[flink@amf-events-to-inference-and-central.nwdaf-edge|mailto:flink@amf-events-to-inference-and-central.nwdaf-edge]:6123] > has failed, address is now gated for [50] ms. Reason: [Association failed > with > [akka.tcp://[flink@amf-events-to-inference-and-central.nwdaf-edge|mailto:flink@amf-events-to-inference-and-central.nwdaf-edge]:6123]] > Caused by: [The remote system explicitly disassociated (reason unknown).]}} > > The reason for the issue is that the JobManager service port definitions are > not following the Istio guidelines > [https://istio.io/latest/docs/ops/configuration/traffic-management/protocol-selection/] > (see example below). > > There was also an email discussion around this topic in the users mailing > group under the subject "Flink Kubernetes Operator with K8S + Istio + mTLS - > port definitions". > With the help of the community, we were able to work around the issue but it > was very hard and forced us to skip Istio proxy which is not ideal. 
> > We would like you to consider changing the default port definitions, either > # Rename the ports – I understand it is an Istio-specific guideline but maybe > it is better to at least be aligned with one (popular) vendor guideline > instead of none at all. > # Add the “appProtocol” property[1] that is not specific to any vendor but > requires Kubernetes >= 1.19 where it was introduced as beta and moved to > stable in >= 1.20. The option to add the appProtocol property was added only in > [https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0] with > [#3570|https://github.com/fabric8io/kubernetes-client/issues/3570]. > # Or allow a way to override the defaults. > > [https://kubernetes.io/docs/concepts/services-networking/_print/#application-protocol] > > > {{# k get service inference-results-to-analytics-engine -o yaml}} > {{apiVersion: v1}} > {{kind: Service}} > {{...}} > {{spec:}} > {{ clusterIP: None}} > {{ ports:}} > {{ - name: jobmanager-rpc *# should start with “tcp-“ or add "appProtocol" > property*}} > {{ port: 6123}} > {{ protocol: TCP}} > {{ targetPort: 6123}} > {{ - name: blobserver *# should start with "tcp-" or add "appProtocol" > property*}} > {{ port: 6124}} > {{ protocol: TCP}} > {{ targetPort: 6124}} > {{...}}
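The Istio-friendly port definitions proposed above can be sketched programmatically. A minimal Python sketch (the service name is hypothetical; the "tcp-" name prefix follows Istio's protocol-selection guideline, and appProtocol requires Kubernetes >= 1.19) that emits such a JobManager Service manifest:

```python
import json

def jobmanager_service_ports():
    # Both compliant options applied at once: "tcp-" name prefix and appProtocol.
    return [
        {"name": "tcp-jobmanager-rpc", "port": 6123, "protocol": "TCP",
         "targetPort": 6123, "appProtocol": "tcp"},
        {"name": "tcp-blobserver", "port": 6124, "protocol": "TCP",
         "targetPort": 6124, "appProtocol": "tcp"},
    ]

def jobmanager_service(name="flink-jobmanager"):  # name is an assumption
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        # "None" makes this a headless service, as in the example above.
        "spec": {"clusterIP": "None", "ports": jobmanager_service_ports()},
    }

if __name__ == "__main__":
    print(json.dumps(jobmanager_service(), indent=2))
```

Either option alone is sufficient for Istio's protocol selection; applying both keeps the manifest compatible with clusters older than 1.19.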
[jira] [Updated] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled
[ https://issues.apache.org/jira/browse/FLINK-31775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Sainz updated FLINK-31775: - Description: When using the native Kubernetes deployment mode with high-availability (HA), when a new TaskManager pod is started to process a job, the TaskManager pod attempts to register itself with the resource manager (JobManager). The TaskManager looks up the resource manager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Nevertheless, when Istio is enabled, resolution by IP address is blocked, and hence the job cannot start because the TaskManager cannot register with the resource manager: 2023-04-10 23:24:19,752 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 1 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. Notice that when HA is disabled, the resource manager is resolved by service name, and so it can be found: 2023-04-11 00:49:34,162 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful registration at resource manager akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923. Notice that in my case it is not possible to disable Istio, as explained here: [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 won't cover this case: FLINK-28171 is about the Flink Kubernetes Operator, while this is about the native Kubernetes deployment. 
was: When using the native Kubernetes deployment mode with high-availability, when a new TaskManager pod is started to process a job, the TaskManager pod attempts to register itself with the resource manager (JobManager). The TaskManager looks up the resource manager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Nevertheless, when Istio is enabled, resolution by IP address is blocked, and hence the job cannot start because the TaskManager cannot register with the resource manager: 2023-04-10 23:24:19,752 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 1 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. Notice that when HA is disabled, the resource manager is resolved by service name, and so it can be found: 2023-04-11 00:49:34,162 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful registration at resource manager akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923. Notice that in my case it is not possible to disable Istio, as explained here: [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 won't cover this case: FLINK-28171 is about the Flink Kubernetes Operator, while this is about the native Kubernetes deployment. 
> High-Availability not supported in kubernetes when istio enabled > > > Key: FLINK-31775 > URL: https://issues.apache.org/jira/browse/FLINK-31775 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.16.1 >Reporter: Sergio Sainz >Priority: Major > > When using native kubernetes deployment mode with high-availability (HA), and > when new TaskManager pod is started to process a job, the TaskManager pod > will attempt to register itself to the resource manager (JobManager). the > TaskManager looks up the resource manager per ip-address > (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1) > > Nevertheless when istio is enabled, the resolution by ip address is blocked, > and hence we see that the job cannot start because task manager cannot > register with the resource manager: > 2023-04-10 23:24:19,752 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not > resolve ResourceManager address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in > 1 ms: Could not connect to rpc endpoint under address > akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1.
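To illustrate the distinction the logs above show, here is a small Python sketch (illustration only, not Flink code) that classifies a ResourceManager address as pod-IP-based (the form used when HA is enabled, which Istio blocks) or service-name-based (the form used when HA is disabled, which resolves fine):

```python
import re

# Matches the akka RPC addresses quoted in the logs above.
ADDRESS = re.compile(
    r"akka\.tcp://flink@(?P<host>[^:]+):\d+/user/rpc/resourcemanager_.*"
)

def address_kind(address: str) -> str:
    host = ADDRESS.match(address).group("host")
    # A dotted-quad host means the TaskManager was handed a raw pod IP.
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host):
        return "pod-ip"        # HA enabled: direct pod IP, rejected by Istio mTLS
    return "service-name"      # HA disabled: cluster-internal service DNS name

if __name__ == "__main__":
    print(address_kind(
        "akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1"))
```

Run against the two addresses from the report, the first yields "pod-ip" and the second "service-name", which is exactly the asymmetry that makes HA fail under Istio.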
[jira] [Updated] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled
[ https://issues.apache.org/jira/browse/FLINK-31775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Sainz updated FLINK-31775: - Description: When using the native Kubernetes deployment mode with high-availability, when a new TaskManager pod is started to process a job, the TaskManager pod attempts to register itself with the resource manager (JobManager). The TaskManager looks up the resource manager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Nevertheless, when Istio is enabled, resolution by IP address is blocked, and hence the job cannot start because the TaskManager cannot register with the resource manager: 2023-04-10 23:24:19,752 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 1 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. Notice that when HA is disabled, the resource manager is resolved by service name, and so it can be found: 2023-04-11 00:49:34,162 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful registration at resource manager akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923. Notice that in my case it is not possible to disable Istio, as explained here: [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 won't cover this case: FLINK-28171 is about the Flink Kubernetes Operator, while this is about the native Kubernetes deployment. 
was: When using the native Kubernetes deployment mode, when a new TaskManager pod is started to process a job, the TaskManager pod attempts to register itself with the resource manager (JobManager). The TaskManager looks up the resource manager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Nevertheless, when Istio is enabled, resolution by IP address is blocked, and hence the job cannot start because the TaskManager cannot register with the resource manager: 2023-04-10 23:24:19,752 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 1 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. Notice that when HA is disabled, the resource manager is resolved by service name, and so it can be found: 2023-04-11 00:49:34,162 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful registration at resource manager akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923. Notice that in my case it is not possible to disable Istio, as explained here: [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 won't cover this case: FLINK-28171 is about the Flink Kubernetes Operator, while this is about the native Kubernetes deployment. 
[jira] [Updated] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled
[ https://issues.apache.org/jira/browse/FLINK-31775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Sainz updated FLINK-31775: - Description: When using the native Kubernetes deployment mode, when a new TaskManager pod is started to process a job, the TaskManager pod attempts to register itself with the resource manager (JobManager). The TaskManager looks up the resource manager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Nevertheless, when Istio is enabled, resolution by IP address is blocked, and hence the job cannot start because the TaskManager cannot register with the resource manager: 2023-04-10 23:24:19,752 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 1 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. Notice that when HA is disabled, the resource manager is resolved by service name, and so it can be found: 2023-04-11 00:49:34,162 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful registration at resource manager akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923. Notice that in my case it is not possible to disable Istio, as explained here: [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 won't cover this case: FLINK-28171 is about the Flink Kubernetes Operator, while this is about the native Kubernetes deployment. 
was: When using the native Kubernetes deployment mode, when a new TaskManager pod is started to process a job, the TaskManager pod attempts to register itself with the resource manager (JobManager). The TaskManager looks up the resource manager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Nevertheless, when Istio is enabled, resolution by IP address is blocked, and hence the job cannot start because the TaskManager cannot register with the resource manager: 2023-04-10 23:24:19,752 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 1 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. Notice that when HA is disabled, the resource manager is resolved by service name, and so it can be found: 2023-04-11 00:49:34,162 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful registration at resource manager akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923. Notice that in my case it is not possible to disable Istio, as explained here: [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 won't cover this case: FLINK-28171 is about the Flink Kubernetes Operator. 
[jira] [Updated] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled
[ https://issues.apache.org/jira/browse/FLINK-31775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Sainz updated FLINK-31775: - Description: When using the native Kubernetes deployment mode, when a new TaskManager pod is started to process a job, the TaskManager pod attempts to register itself with the resource manager (JobManager). The TaskManager looks up the resource manager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Nevertheless, when Istio is enabled, resolution by IP address is blocked, and hence the job cannot start because the TaskManager cannot register with the resource manager: 2023-04-10 23:24:19,752 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 1 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. Notice that when HA is disabled, the resource manager is resolved by service name, and so it can be found: 2023-04-11 00:49:34,162 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful registration at resource manager akka.tcp://flink@myenv-dev-flink-cluster.myenv-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923. Notice that in my case it is not possible to disable Istio, as explained here: [https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html] Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 won't cover this case: FLINK-28171 is about the Flink Kubernetes Operator. was: When using the native Kubernetes deployment mode, when a new TaskManager is started to process a job, the TaskManager attempts to register itself with the resource manager (JobManager). 
The TaskManager looks up the resource manager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Nevertheless, when Istio is enabled, resolution by IP address is blocked, and hence the job cannot start because the TaskManager cannot register with the resource manager: 2023-04-10 23:24:19,752 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 1 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. Notice that when HA is disabled, the resource manager is resolved by service name, and so it can be found: 2023-04-11 00:49:34,162 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful registration at resource manager akka.tcp://fl...@local-mci-ar32a-dev-flink-cluster.mstr-env-mci-ar32a-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923. Notice that it is not possible to disable Istio (as explained here: https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html). Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 won't cover this case: FLINK-28171 is about the Flink Kubernetes Operator. 
[jira] [Created] (FLINK-31775) High-Availability not supported in kubernetes when istio enabled
Sergio Sainz created FLINK-31775: Summary: High-Availability not supported in kubernetes when istio enabled Key: FLINK-31775 URL: https://issues.apache.org/jira/browse/FLINK-31775 Project: Flink Issue Type: Bug Components: Deployment / Kubernetes Affects Versions: 1.16.1 Reporter: Sergio Sainz When using the native Kubernetes deployment mode, when a new TaskManager is started to process a job, the TaskManager attempts to register itself with the resource manager (JobManager). The TaskManager looks up the resource manager by IP address (akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1). Nevertheless, when Istio is enabled, resolution by IP address is blocked, and hence the job cannot start because the TaskManager cannot register with the resource manager: 2023-04-10 23:24:19,752 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1, retrying in 1 ms: Could not connect to rpc endpoint under address akka.tcp://flink@192.168.140.164:6123/user/rpc/resourcemanager_1. Notice that when HA is disabled, the resource manager is resolved by service name, and so it can be found: 2023-04-11 00:49:34,162 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Successful registration at resource manager akka.tcp://fl...@local-mci-ar32a-dev-flink-cluster.mstr-env-mci-ar32a-dev:6123/user/rpc/resourcemanager_* under registration id 83ad942597f86aa880ee96f1c2b8b923. Notice that it is not possible to disable Istio (as explained here: https://doc.akka.io/docs/akka-management/current/bootstrap/istio.html). Although this is similar to https://issues.apache.org/jira/browse/FLINK-28171, I am logging it as a separate defect because I believe the fix for FLINK-28171 won't cover this case: FLINK-28171 is about the Flink Kubernetes Operator.
[jira] [Comment Edited] (FLINK-28171) Adjust Job and Task manager port definitions to work with Istio+mTLS
[ https://issues.apache.org/jira/browse/FLINK-28171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17710375#comment-17710375 ] Sergio Sainz edited comment on FLINK-28171 at 4/11/23 2:42 AM: --- Hello [~wangyang0918]! I am also experiencing this issue with a native Kubernetes deployment with HA enabled (not with the Flink Kubernetes operator). In [https://lists.apache.org/thread/yl40s9069wksz66qlf9t6jhmwsn59zft] you mentioned that "If HA enabled, the internal jobmanager rpc service will not be created. Instead, the TaskManager retrieves the JobManager address via HA services and connects to it via pod ip." Do you know whether we can change the way the TaskManager connects to HA services, i.e., use the pod name instead of the IP address? I think we cannot apply the Akka workaround of bypassing the Istio sidecar. Thanks for the info ~ was (Author: sergiosp): Hello [~wangyang0918]! I am also experiencing this issue with a native Kubernetes deployment with HA enabled (not with the Flink Kubernetes operator). In [https://lists.apache.org/thread/yl40s9069wksz66qlf9t6jhmwsn59zft] you mentioned that "If HA enabled, the internal jobmanager rpc service will not be created. Instead, the TaskManager retrieves the JobManager address via HA services and connects to it via pod ip." Do you know whether we can change the way the TaskManager connects to HA services (i.e., use the service name instead of the IP address)? I think we cannot apply the Akka workaround of bypassing the Istio sidecar. 
Thanks for the info ~
[jira] [Commented] (FLINK-28171) Adjust Job and Task manager port definitions to work with Istio+mTLS
[ https://issues.apache.org/jira/browse/FLINK-28171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17710375#comment-17710375 ] Sergio Sainz commented on FLINK-28171: -- Hello [~wangyang0918] ! also experiencing this issue with native kubernetes deployment with HA enabled (not in flink k8s operator). In [https://lists.apache.org/thread/yl40s9069wksz66qlf9t6jhmwsn59zft] you mentioned that "If HA enabled, the internal jobmanager rpc service will not be created. Instead, the TaskManager retrieves the JobManager address via HA services and connects to it via pod ip." Do you know whether we can change the way TaskManager connects to HA services (do not use ip address and instead use service name? I think we cannot add the akka workaround of bypassing the istio sidecar. Thanks for the info ~ > Adjust Job and Task manager port definitions to work with Istio+mTLS > > > Key: FLINK-28171 > URL: https://issues.apache.org/jira/browse/FLINK-28171 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.14.4 > Environment: flink-kubernetes-operator 1.0.0 > Flink 1.14-java11 > Kubernetes v1.19.5 > Istio 1.7.6 >Reporter: Moshe Elisha >Assignee: Moshe Elisha >Priority: Major > Labels: pull-request-available > > Hello, > > We are launching Flink deployments using the [Flink Kubernetes > Operator|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-stable/] > on a Kubernetes cluster with Istio and mTLS enabled. > > We found that the TaskManager is unable to communicate with the JobManager on > the jobmanager-rpc port: > > {{2022-06-15 15:25:40,508 WARN akka.remote.ReliableDeliverySupervisor > [] - Association with remote system > [akka.tcp://[flink@amf-events-to-inference-and-central.nwdaf-edge|mailto:flink@amf-events-to-inference-and-central.nwdaf-edge]:6123] > has failed, address is now gated for [50] ms. 
Reason: [Association failed > with > [akka.tcp://[flink@amf-events-to-inference-and-central.nwdaf-edge|mailto:flink@amf-events-to-inference-and-central.nwdaf-edge]:6123]] > Caused by: [The remote system explicitly disassociated (reason unknown).]}} > > The reason for the issue is that the JobManager service port definitions are > not following the Istio guidelines > [https://istio.io/latest/docs/ops/configuration/traffic-management/protocol-selection/] > (see example below). > > There was also an email discussion around this topic in the users mailing > group under the subject "Flink Kubernetes Operator with K8S + Istio + mTLS - > port definitions". > With the help of the community, we were able to work around the issue but it > was very hard and forced us to skip Istio proxy which is not ideal. > > We would like you to consider changing the default port definitions, either > # Rename the ports – I understand it is Istio specific guideline but maybe > it is better to at least be aligned with one (popular) vendor guideline > instead of none at all. > # Add the “appProtocol” property[1] that is not specific to any vendor but > requires Kubernetes >= 1.19 where it was introduced as beta and moved to > stable in >= 1.20. The option to add appProtocol property was added only in > [https://github.com/fabric8io/kubernetes-client/releases/tag/v5.10.0] with > [#3570|https://github.com/fabric8io/kubernetes-client/issues/3570]. > # Or allow a way to override the defaults. 
> > [https://kubernetes.io/docs/concepts/services-networking/_print/#application-protocol] > > > {{# k get service inference-results-to-analytics-engine -o yaml}} > {{apiVersion: v1}} > {{kind: Service}} > {{...}} > {{spec:}} > {{ clusterIP: None}} > {{ ports:}} > {{ - name: jobmanager-rpc *# should start with “tcp-“ or add "appProtocol" > property*}} > {{ port: 6123}} > {{ protocol: TCP}} > {{ targetPort: 6123}} > {{ - name: blobserver *# should start with "tcp-" or add "appProtocol" > property*}} > {{ port: 6124}} > {{ protocol: TCP}} > {{ targetPort: 6124}} > {{...}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
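For reference, a Service manifest following the Istio protocol-selection guideline could look like the sketch below, derived from the port list quoted above. The "tcp-" names and the appProtocol values are illustrative (option 1 and option 2 from the issue, shown together for compactness), and appProtocol requires Kubernetes >= 1.19:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: inference-results-to-analytics-engine   # name taken from the example above
spec:
  clusterIP: None
  ports:
    - name: tcp-jobmanager-rpc   # "tcp-" prefix lets Istio classify the protocol
      port: 6123
      protocol: TCP
      targetPort: 6123
      appProtocol: tcp           # vendor-neutral alternative (K8s >= 1.19)
    - name: tcp-blobserver
      port: 6124
      protocol: TCP
      targetPort: 6124
      appProtocol: tcp
```

In practice only one of the two mechanisms is needed per port; Istio honours either the name prefix or the appProtocol hint.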
[jira] [Updated] (FLINK-31685) Checkpoint job folder not deleted after job is cancelled
[ https://issues.apache.org/jira/browse/FLINK-31685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Sainz updated FLINK-31685: - Description: When flink job is being checkpointed, and after the job is cancelled, the checkpoint is indeed deleted (as per {{{}execution.checkpointing.externalized-checkpoint-retention: DELETE_ON_CANCELLATION{}}}), but the job-id folder still remains: [sergio@flink-cluster-54f7fc7c6-k6km8 JobCheckpoints]$ ls 01eff17aa2910484b5aeb644bc531172 3a59309ef018541fc0c20856d0d89855 78ff2344dd7ef89f9fbcc9789fc0cd79 a6fd7cec89c0af78c3353d4a46a7d273 dbc957868c08ebeb100d708bbd057593 04ff0abb9e860fc85f0e39d722367c3c 3e09166341615b1b4786efd6745a05d6 79efc000aa29522f0a9598661f485f67 a8c42bfe158abd78ebcb4adb135de61f dc8e04b02c9d8a1bc04b21d2c8f21f74 05f48019475de40230900230c63cfe89 3f9fb467c9af91ef41d527fe92f9b590 7a6ad7407d7120eda635d71cd843916a a8db748c1d329407405387ac82040be4 dfb2df1c25056e920d41c94b659dcdab 09d30bc0ff786994a6a3bb06abd3 455525b76a1c6826b6eaebd5649c5b6b 7b1458424496baaf3d020e9fece525a4 aa2ef9587b2e9c123744e8940a66a287 All folders in the above list, like {{01eff17aa2910484b5aeb644bc531172}} , are empty ~ *Expected behaviour:* The job folder id should also be deleted. 
was: When flink job is being checkpointed, and after the job is cancelled, the checkpoint is indeed deleted (as per {{{}execution.checkpointing.externalized-checkpoint-retention: DELETE_ON_CANCELLATION{}}}), but the job-id folder still remains: sergio@flink-cluster-54f7fc7c6-k6km8 JobCheckpoints]$ ls 01eff17aa2910484b5aeb644bc531172 3a59309ef018541fc0c20856d0d89855 78ff2344dd7ef89f9fbcc9789fc0cd79 a6fd7cec89c0af78c3353d4a46a7d273 dbc957868c08ebeb100d708bbd057593 04ff0abb9e860fc85f0e39d722367c3c 3e09166341615b1b4786efd6745a05d6 79efc000aa29522f0a9598661f485f67 a8c42bfe158abd78ebcb4adb135de61f dc8e04b02c9d8a1bc04b21d2c8f21f74 05f48019475de40230900230c63cfe89 3f9fb467c9af91ef41d527fe92f9b590 7a6ad7407d7120eda635d71cd843916a a8db748c1d329407405387ac82040be4 dfb2df1c25056e920d41c94b659dcdab 09d30bc0ff786994a6a3bb06abd3 455525b76a1c6826b6eaebd5649c5b6b 7b1458424496baaf3d020e9fece525a4 aa2ef9587b2e9c123744e8940a66a287 All folders in the above list, like {{01eff17aa2910484b5aeb644bc531172}} , are empty ~ *Expected behaviour:* The job folder id should also be deleted. 
> Checkpoint job folder not deleted after job is cancelled > > > Key: FLINK-31685 > URL: https://issues.apache.org/jira/browse/FLINK-31685 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.16.1 >Reporter: Sergio Sainz >Priority: Major > > When flink job is being checkpointed, and after the job is cancelled, the > checkpoint is indeed deleted (as per > {{{}execution.checkpointing.externalized-checkpoint-retention: > DELETE_ON_CANCELLATION{}}}), but the job-id folder still remains: > > [sergio@flink-cluster-54f7fc7c6-k6km8 JobCheckpoints]$ ls > 01eff17aa2910484b5aeb644bc531172 3a59309ef018541fc0c20856d0d89855 > 78ff2344dd7ef89f9fbcc9789fc0cd79 a6fd7cec89c0af78c3353d4a46a7d273 > dbc957868c08ebeb100d708bbd057593 > 04ff0abb9e860fc85f0e39d722367c3c 3e09166341615b1b4786efd6745a05d6 > 79efc000aa29522f0a9598661f485f67 a8c42bfe158abd78ebcb4adb135de61f > dc8e04b02c9d8a1bc04b21d2c8f21f74 > 05f48019475de40230900230c63cfe89 3f9fb467c9af91ef41d527fe92f9b590 > 7a6ad7407d7120eda635d71cd843916a a8db748c1d329407405387ac82040be4 > dfb2df1c25056e920d41c94b659dcdab > 09d30bc0ff786994a6a3bb06abd3 455525b76a1c6826b6eaebd5649c5b6b > 7b1458424496baaf3d020e9fece525a4 aa2ef9587b2e9c123744e8940a66a287 > All folders in the above list, like {{01eff17aa2910484b5aeb644bc531172}} , > are empty ~ > > *Expected behaviour:* > The job folder id should also be deleted. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-31685) Checkpoint job folder not deleted after job is cancelled
[ https://issues.apache.org/jira/browse/FLINK-31685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Sainz updated FLINK-31685: - Description: When flink job is being checkpointed, and after the job is cancelled, the checkpoint is indeed deleted (as per {{{}execution.checkpointing.externalized-checkpoint-retention: DELETE_ON_CANCELLATION{}}}), but the job-id folder still remains: sergio@flink-cluster-54f7fc7c6-k6km8 JobCheckpoints]$ ls 01eff17aa2910484b5aeb644bc531172 3a59309ef018541fc0c20856d0d89855 78ff2344dd7ef89f9fbcc9789fc0cd79 a6fd7cec89c0af78c3353d4a46a7d273 dbc957868c08ebeb100d708bbd057593 04ff0abb9e860fc85f0e39d722367c3c 3e09166341615b1b4786efd6745a05d6 79efc000aa29522f0a9598661f485f67 a8c42bfe158abd78ebcb4adb135de61f dc8e04b02c9d8a1bc04b21d2c8f21f74 05f48019475de40230900230c63cfe89 3f9fb467c9af91ef41d527fe92f9b590 7a6ad7407d7120eda635d71cd843916a a8db748c1d329407405387ac82040be4 dfb2df1c25056e920d41c94b659dcdab 09d30bc0ff786994a6a3bb06abd3 455525b76a1c6826b6eaebd5649c5b6b 7b1458424496baaf3d020e9fece525a4 aa2ef9587b2e9c123744e8940a66a287 All folders in the above list, like {{01eff17aa2910484b5aeb644bc531172}} , are empty ~ *Expected behaviour:* The job folder id should also be deleted. 
was: When flink job is being checkpointed, and after the job is cancelled, the checkpoint is indeed deleted (as per {{{}execution.checkpointing.externalized-checkpoint-retention: DELETE_ON_CANCELLATION{}}}), but the job-id folder still remains: [sergio@flink-cluster-54f7fc7c6-k6km8 JobCheckpoints]$ ls 01eff17aa2910484b5aeb644bc531172 3a59309ef018541fc0c20856d0d89855 78ff2344dd7ef89f9fbcc9789fc0cd79 a6fd7cec89c0af78c3353d4a46a7d273 dbc957868c08ebeb100d708bbd057593 04ff0abb9e860fc85f0e39d722367c3c 3e09166341615b1b4786efd6745a05d6 79efc000aa29522f0a9598661f485f67 a8c42bfe158abd78ebcb4adb135de61f dc8e04b02c9d8a1bc04b21d2c8f21f74 05f48019475de40230900230c63cfe89 3f9fb467c9af91ef41d527fe92f9b590 7a6ad7407d7120eda635d71cd843916a a8db748c1d329407405387ac82040be4 dfb2df1c25056e920d41c94b659dcdab 09d30bc0ff786994a6a3bb06abd3 455525b76a1c6826b6eaebd5649c5b6b 7b1458424496baaf3d020e9fece525a4 aa2ef9587b2e9c123744e8940a66a287 All folders in the above list, like {{01eff17aa2910484b5aeb644bc531172}} , are empty ~ *Expected behaviour:* The job-id folder should also be deleted. 
> Checkpoint job folder not deleted after job is cancelled > > > Key: FLINK-31685 > URL: https://issues.apache.org/jira/browse/FLINK-31685 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.16.1 >Reporter: Sergio Sainz >Priority: Major > > When flink job is being checkpointed, and after the job is cancelled, the > checkpoint is indeed deleted (as per > {{{}execution.checkpointing.externalized-checkpoint-retention: > DELETE_ON_CANCELLATION{}}}), but the job-id folder still remains: > > sergio@flink-cluster-54f7fc7c6-k6km8 JobCheckpoints]$ ls > 01eff17aa2910484b5aeb644bc531172 3a59309ef018541fc0c20856d0d89855 > 78ff2344dd7ef89f9fbcc9789fc0cd79 a6fd7cec89c0af78c3353d4a46a7d273 > dbc957868c08ebeb100d708bbd057593 > 04ff0abb9e860fc85f0e39d722367c3c 3e09166341615b1b4786efd6745a05d6 > 79efc000aa29522f0a9598661f485f67 a8c42bfe158abd78ebcb4adb135de61f > dc8e04b02c9d8a1bc04b21d2c8f21f74 > 05f48019475de40230900230c63cfe89 3f9fb467c9af91ef41d527fe92f9b590 > 7a6ad7407d7120eda635d71cd843916a a8db748c1d329407405387ac82040be4 > dfb2df1c25056e920d41c94b659dcdab > 09d30bc0ff786994a6a3bb06abd3 455525b76a1c6826b6eaebd5649c5b6b > 7b1458424496baaf3d020e9fece525a4 aa2ef9587b2e9c123744e8940a66a287 > All folders in the above list, like {{01eff17aa2910484b5aeb644bc531172}} , > are empty ~ > > *Expected behaviour:* > The job folder id should also be deleted. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31685) Checkpoint job folder not deleted after job is cancelled
Sergio Sainz created FLINK-31685: Summary: Checkpoint job folder not deleted after job is cancelled Key: FLINK-31685 URL: https://issues.apache.org/jira/browse/FLINK-31685 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Affects Versions: 1.16.1 Reporter: Sergio Sainz When flink job is being checkpointed, and after the job is cancelled, the checkpoint is indeed deleted (as per {{{}execution.checkpointing.externalized-checkpoint-retention: DELETE_ON_CANCELLATION{}}}), but the job-id folder still remains: [sergio@flink-cluster-54f7fc7c6-k6km8 JobCheckpoints]$ ls 01eff17aa2910484b5aeb644bc531172 3a59309ef018541fc0c20856d0d89855 78ff2344dd7ef89f9fbcc9789fc0cd79 a6fd7cec89c0af78c3353d4a46a7d273 dbc957868c08ebeb100d708bbd057593 04ff0abb9e860fc85f0e39d722367c3c 3e09166341615b1b4786efd6745a05d6 79efc000aa29522f0a9598661f485f67 a8c42bfe158abd78ebcb4adb135de61f dc8e04b02c9d8a1bc04b21d2c8f21f74 05f48019475de40230900230c63cfe89 3f9fb467c9af91ef41d527fe92f9b590 7a6ad7407d7120eda635d71cd843916a a8db748c1d329407405387ac82040be4 dfb2df1c25056e920d41c94b659dcdab 09d30bc0ff786994a6a3bb06abd3 455525b76a1c6826b6eaebd5649c5b6b 7b1458424496baaf3d020e9fece525a4 aa2ef9587b2e9c123744e8940a66a287 All folders in the above list, like {{01eff17aa2910484b5aeb644bc531172}} , are empty ~ *Expected behaviour:* The job-id folder should also be deleted. -- This message was sent by Atlassian Jira (v8.20.10#820010)
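As a stop-gap until this is fixed in Flink itself, the leftover empty job-id folders can be pruned externally. Below is a minimal sketch (not Flink code; the checkpoint root path is assumed to be the JobCheckpoints directory shown above):

```python
from pathlib import Path


def prune_empty_job_dirs(checkpoint_root: str) -> list[str]:
    """Remove immediate subdirectories of the checkpoint root that are
    completely empty (the leftover job-id folders described in this issue).

    Returns the names of the directories that were removed. Non-empty
    directories (e.g. jobs with retained checkpoints) are left untouched.
    """
    removed = []
    for child in sorted(Path(checkpoint_root).iterdir()):
        if child.is_dir() and not any(child.iterdir()):
            child.rmdir()
            removed.append(child.name)
    return removed
```

Running this periodically (or after job cancellation) against the configured checkpoint directory would keep the listing above from accumulating empty folders.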
[jira] [Commented] (FLINK-15736) Support Java 17 (LTS)
[ https://issues.apache.org/jira/browse/FLINK-15736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17697544#comment-17697544 ] Sergio Sainz commented on FLINK-15736: -- Hi [~chesnay] , I saw in June last year that there was no timeline for implementing JDK 17 support. Just kindly checking whether the timeline has changed, as there has been some more investigation on this ticket and on the various related bugs found by the JDK 17 upgrade. Thanks! > Support Java 17 (LTS) > - > > Key: FLINK-15736 > URL: https://issues.apache.org/jira/browse/FLINK-15736 > Project: Flink > Issue Type: New Feature > Components: Build System >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: auto-deprioritized-major, pull-request-available, > stale-assigned > > Long-term issue for preparing Flink for Java 17. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-31247) Support for fine-grained resource allocation (slot sharing groups) at the TableAPI level
Sergio Sainz created FLINK-31247: Summary: Support for fine-grained resource allocation (slot sharing groups) at the TableAPI level Key: FLINK-31247 URL: https://issues.apache.org/jira/browse/FLINK-31247 Project: Flink Issue Type: New Feature Components: Runtime / Coordination Affects Versions: 1.16.1 Reporter: Sergio Sainz Currently Flink allows fine-grained resource allocation at the DataStream API level (see [https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/finegrained_resource/]). This ticket is an enhancement request to support the same API at the Table API level: org.apache.flink.table.api.Table.setSlotSharingGroup(String ssg) org.apache.flink.table.api.Table.setSlotSharingGroup(SlotSharingGroup ssg) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29235) CVE-2022-25857 on flink-shaded
[ https://issues.apache.org/jira/browse/FLINK-29235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17629998#comment-17629998 ] Sergio Sainz commented on FLINK-29235: -- Hi, could we evaluate this fix for inclusion in 1.16.1? > CVE-2022-25857 on flink-shaded > -- > > Key: FLINK-29235 > URL: https://issues.apache.org/jira/browse/FLINK-29235 > Project: Flink > Issue Type: Bug > Components: Build System, BuildSystem / Shaded >Affects Versions: 1.15.2 >Reporter: Sergio Sainz >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.17.0 > > > flink-shaded-version uses snakeyaml v1.29 which is vulnerable to > CVE-2022-25857 > Ref: > https://nvd.nist.gov/vuln/detail/CVE-2022-25857 > https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.12.4-15.0/flink-shaded-jackson-2.12.4-15.0.pom > https://github.com/apache/flink-shaded/blob/master/flink-shaded-jackson-parent/flink-shaded-jackson-2/pom.xml#L73 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-29892) flink-conf.yaml does not accept hash (#) in the env.java.opts property
[ https://issues.apache.org/jira/browse/FLINK-29892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Sainz updated FLINK-29892: - Description: When adding a string with hash (#) character in env.java.opts in flink-conf.yaml , the string will be truncated from the # onwards even when the value is surrounded by single quotes or double quotes. example: (in flink-conf.yaml): env.java.opts: "-Djavax.net.ssl.trustStorePassword=my#pwd" the value shown on the flink taskmanagers or job managers is : env.java.opts: -Djavax.net.ssl.trustStorePassword=my was: When adding a string with hash (#) character in env.java.opts in flink-conf.yaml , the string will be truncated from the # onwards even when the value is surrounded by single quotes or double quotes. example: (in flink-conf.yaml): env.java.opts: -Djavax.net.ssl.trustStorePassword=my#pwd the value shown on the flink taskmanagers or job managers is : env.java.opts: -Djavax.net.ssl.trustStorePassword=my > flink-conf.yaml does not accept hash (#) in the env.java.opts property > -- > > Key: FLINK-29892 > URL: https://issues.apache.org/jira/browse/FLINK-29892 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.15.2 >Reporter: Sergio Sainz >Priority: Major > > When adding a string with hash (#) character in env.java.opts in > flink-conf.yaml , the string will be truncated from the # onwards even when > the value is surrounded by single quotes or double quotes. > example: > (in flink-conf.yaml): > env.java.opts: "-Djavax.net.ssl.trustStorePassword=my#pwd" > > the value shown on the flink taskmanagers or job managers is : > env.java.opts: -Djavax.net.ssl.trustStorePassword=my > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29892) flink-conf.yaml does not accept hash (#) in the env.java.opts property
Sergio Sainz created FLINK-29892: Summary: flink-conf.yaml does not accept hash (#) in the env.java.opts property Key: FLINK-29892 URL: https://issues.apache.org/jira/browse/FLINK-29892 Project: Flink Issue Type: Bug Components: Deployment / Kubernetes Affects Versions: 1.15.2 Reporter: Sergio Sainz When adding a string with a hash (#) character to env.java.opts in flink-conf.yaml, the string will be truncated from the # onwards, even when the value is surrounded by single or double quotes. Example (in flink-conf.yaml): env.java.opts: -Djavax.net.ssl.trustStorePassword=my#pwd The value shown on the flink taskmanagers or job managers is: env.java.opts: -Djavax.net.ssl.trustStorePassword=my -- This message was sent by Atlassian Jira (v8.20.10#820010)
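The symptom is consistent with a comment-stripping pass that does not honour quoting. The sketch below models that assumed behaviour (it is not Flink's actual parser) to show why quoting the value does not protect the part after the hash:

```python
def parse_flink_conf_line(line: str) -> tuple[str, str]:
    """Assumed model of a legacy flink-conf.yaml line parser:
    everything after '#' is dropped as a comment, regardless of quotes,
    then the remainder is split on the first ':' into key and value."""
    line = line.split("#", 1)[0]           # comment stripping ignores quoting
    key, _, value = line.partition(":")
    return key.strip(), value.strip().strip('"')


key, value = parse_flink_conf_line(
    'env.java.opts: "-Djavax.net.ssl.trustStorePassword=my#pwd"'
)
# value ends up as -Djavax.net.ssl.trustStorePassword=my -- the #pwd tail is lost,
# matching the truncated value observed on the task and job managers.
```

Under this model the quotes are simply part of the (already truncated) value, which is why neither single nor double quoting helps.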
[jira] [Comment Edited] (FLINK-29235) CVE-2022-25857 on flink-shaded
[ https://issues.apache.org/jira/browse/FLINK-29235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17623369#comment-17623369 ] Sergio Sainz edited comment on FLINK-29235 at 10/24/22 7:45 PM: Hello [~chesnay] Noticed the flink-shaded-jackson v2.13.4-16.0 already has the fix (it uses jackson's own snakeyaml version which is 1.31). Could we upgrade flink-shaded version in flink version 1.16.0 to use 2.13.4-16.0? [https://github.com/apache/flink/blob/release-1.16.0-rc2/pom.xml#L125] {{16.0 }}{{2.13.4}} Ref: [https://repo1.maven.org/maven2/com/fasterxml/jackson/dataformat/jackson-dataformat-yaml/2.13.4/jackson-dataformat-yaml-2.13.4.pom] [https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.13.4-16.0/flink-shaded-jackson-2.13.4-16.0.pom] was (Author: sergiosp): Hello [~chesnay] Noticed the flink-shaded-jackson v2.13.4-16.0 already has the fix (it uses jackson's own snakeyaml version which is 1.31). Could we upgrade flink-shaded version in flink version 1.16.0 to use 2.13.4-16.0? 
[https://github.com/apache/flink/blob/release-1.16.0-rc2/pom.xml#L125] ``` 16.0 2.13.4 ``` ref: [https://repo1.maven.org/maven2/com/fasterxml/jackson/dataformat/jackson-dataformat-yaml/2.13.4/jackson-dataformat-yaml-2.13.4.pom] [https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.13.4-16.0/flink-shaded-jackson-2.13.4-16.0.pom] > CVE-2022-25857 on flink-shaded > -- > > Key: FLINK-29235 > URL: https://issues.apache.org/jira/browse/FLINK-29235 > Project: Flink > Issue Type: Bug > Components: Build System, BuildSystem / Shaded >Affects Versions: 1.15.2 >Reporter: Sergio Sainz >Assignee: Chesnay Schepler >Priority: Major > > flink-shaded-version uses snakeyaml v1.29 which is vulnerable to > CVE-2022-25857 > Ref: > https://nvd.nist.gov/vuln/detail/CVE-2022-25857 > https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.12.4-15.0/flink-shaded-jackson-2.12.4-15.0.pom > https://github.com/apache/flink-shaded/blob/master/flink-shaded-jackson-parent/flink-shaded-jackson-2/pom.xml#L73 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (FLINK-29235) CVE-2022-25857 on flink-shaded
[ https://issues.apache.org/jira/browse/FLINK-29235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17623369#comment-17623369 ] Sergio Sainz edited comment on FLINK-29235 at 10/24/22 7:45 PM: Hello [~chesnay] Noticed the flink-shaded-jackson v2.13.4-16.0 already has the fix (it uses jackson's own snakeyaml version which is 1.31). Could we upgrade flink-shaded version in flink version 1.16.0 to use 2.13.4-16.0? [https://github.com/apache/flink/blob/release-1.16.0-rc2/pom.xml#L125] {{16.0 2.13.4}} Ref: [https://repo1.maven.org/maven2/com/fasterxml/jackson/dataformat/jackson-dataformat-yaml/2.13.4/jackson-dataformat-yaml-2.13.4.pom] [https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.13.4-16.0/flink-shaded-jackson-2.13.4-16.0.pom] was (Author: sergiosp): Hello [~chesnay] Noticed the flink-shaded-jackson v2.13.4-16.0 already has the fix (it uses jackson's own snakeyaml version which is 1.31). Could we upgrade flink-shaded version in flink version 1.16.0 to use 2.13.4-16.0? 
[https://github.com/apache/flink/blob/release-1.16.0-rc2/pom.xml#L125] {{16.0 }}{{2.13.4}} Ref: [https://repo1.maven.org/maven2/com/fasterxml/jackson/dataformat/jackson-dataformat-yaml/2.13.4/jackson-dataformat-yaml-2.13.4.pom] [https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.13.4-16.0/flink-shaded-jackson-2.13.4-16.0.pom] > CVE-2022-25857 on flink-shaded > -- > > Key: FLINK-29235 > URL: https://issues.apache.org/jira/browse/FLINK-29235 > Project: Flink > Issue Type: Bug > Components: Build System, BuildSystem / Shaded >Affects Versions: 1.15.2 >Reporter: Sergio Sainz >Assignee: Chesnay Schepler >Priority: Major > > flink-shaded-version uses snakeyaml v1.29 which is vulnerable to > CVE-2022-25857 > Ref: > https://nvd.nist.gov/vuln/detail/CVE-2022-25857 > https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.12.4-15.0/flink-shaded-jackson-2.12.4-15.0.pom > https://github.com/apache/flink-shaded/blob/master/flink-shaded-jackson-parent/flink-shaded-jackson-2/pom.xml#L73 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-29235) CVE-2022-25857 on flink-shaded
[ https://issues.apache.org/jira/browse/FLINK-29235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17623369#comment-17623369 ] Sergio Sainz commented on FLINK-29235: -- Hello [~chesnay] Noticed the flink-shaded-jackson v2.13.4-16.0 already has the fix (it uses jackson's own snakeyaml version which is 1.31). Could we upgrade flink-shaded version in flink version 1.16.0 to use 2.13.4-16.0? [https://github.com/apache/flink/blob/release-1.16.0-rc2/pom.xml#L125] ``` 16.0 2.13.4 ``` ref: [https://repo1.maven.org/maven2/com/fasterxml/jackson/dataformat/jackson-dataformat-yaml/2.13.4/jackson-dataformat-yaml-2.13.4.pom] [https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.13.4-16.0/flink-shaded-jackson-2.13.4-16.0.pom] > CVE-2022-25857 on flink-shaded > -- > > Key: FLINK-29235 > URL: https://issues.apache.org/jira/browse/FLINK-29235 > Project: Flink > Issue Type: Bug > Components: Build System, BuildSystem / Shaded >Affects Versions: 1.15.2 >Reporter: Sergio Sainz >Assignee: Chesnay Schepler >Priority: Major > > flink-shaded-version uses snakeyaml v1.29 which is vulnerable to > CVE-2022-25857 > Ref: > https://nvd.nist.gov/vuln/detail/CVE-2022-25857 > https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.12.4-15.0/flink-shaded-jackson-2.12.4-15.0.pom > https://github.com/apache/flink-shaded/blob/master/flink-shaded-jackson-parent/flink-shaded-jackson-2/pom.xml#L73 -- This message was sent by Atlassian Jira (v8.20.10#820010)
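For clarity, the two values quoted above ({{16.0}} and {{2.13.4}}) are the pom.xml version properties being proposed for the bump. In the Flink root pom they presumably correspond to something like the fragment below (property names are assumed; verify against the linked pom.xml):

```xml
<properties>
  <!-- bump flink-shaded so the snakeyaml 1.31 fix is pulled in transitively -->
  <flink.shaded.version>16.0</flink.shaded.version>
  <flink.shaded.jackson.version>2.13.4</flink.shaded.jackson.version>
</properties>
```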
[jira] [Created] (FLINK-29631) [CVE-2022-42003] flink-shaded-jackson
Sergio Sainz created FLINK-29631: Summary: [CVE-2022-42003] flink-shaded-jackson Key: FLINK-29631 URL: https://issues.apache.org/jira/browse/FLINK-29631 Project: Flink Issue Type: Bug Components: BuildSystem / Shaded Affects Versions: shaded-16.0 Reporter: Sergio Sainz flink-shaded-jackson vulnerable to 7.5 (high) [https://nvd.nist.gov/vuln/detail/CVE-2022-42003] Ref: [https://nvd.nist.gov/vuln/detail/CVE-2022-42003] [https://repo1.maven.org/maven2/org/apache/flink/flink-shaded/16.0/flink-shaded-16.0.pom] [https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-jackson-parent/2.13.4-16.0] [https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.13.4-16.0/flink-shaded-jackson-2.13.4-16.0.pom] -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29235) CVE-2022-25857 on flink-shaded
Sergio Sainz created FLINK-29235: Summary: CVE-2022-25857 on flink-shaded Key: FLINK-29235 URL: https://issues.apache.org/jira/browse/FLINK-29235 Project: Flink Issue Type: Bug Components: BuildSystem / Shaded Affects Versions: 1.15.2 Reporter: Sergio Sainz flink-shaded-version uses snakeyaml v1.29 which is vulnerable to CVE-2022-25857 Ref: https://nvd.nist.gov/vuln/detail/CVE-2022-25857 https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-jackson/2.12.4-15.0/flink-shaded-jackson-2.12.4-15.0.pom https://github.com/apache/flink-shaded/blob/master/flink-shaded-jackson-parent/flink-shaded-jackson-2/pom.xml#L73 -- This message was sent by Atlassian Jira (v8.20.10#820010)