[jira] [Created] (FLINK-34262) Support to set user specified labels for Rest Service Object

2024-01-29 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-34262:
-

 Summary: Support to set user specified labels for Rest Service 
Object
 Key: FLINK-34262
 URL: https://issues.apache.org/jira/browse/FLINK-34262
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Kubernetes
Affects Versions: 1.18.1
Reporter: Prabhu Joseph


Flink allows users to set labels on the JobManager and TaskManager pods; the rest 
Service object should support user-specified labels as well.
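
For reference, pod labels are already exposed through configuration; below is a 
minimal sketch of the existing options together with a hypothetical analogous 
option for the rest Service (the key {{kubernetes.rest-service.labels}} is an 
assumption for illustration, not an existing option):

{code}
# Existing options for labeling the JM and TM pods
kubernetes.jobmanager.labels: env:production,team:data
kubernetes.taskmanager.labels: env:production,team:data

# Hypothetical analogous option for the rest Service object
kubernetes.rest-service.labels: env:production,team:data
{code}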



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34142) TaskManager WorkingDirectory is not removed during shutdown

2024-01-18 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-34142:
-

 Summary: TaskManager WorkingDirectory is not removed during 
shutdown 
 Key: FLINK-34142
 URL: https://issues.apache.org/jira/browse/FLINK-34142
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Affects Versions: 1.17.1, 1.16.0
Reporter: Prabhu Joseph


TaskManager WorkingDirectory is not removed during shutdown. 

*Repro*

 
{code}
1. Execute a Flink batch job within a Flink on YARN Session

flink-yarn-session -d

flink run -d /usr/lib/flink/examples/batch/WordCount.jar --input 
s3://prabhuflinks3/INPUT --output s3://prabhuflinks3/OUT

{code}
The batch job completes successfully, but the TaskManager working directory is 
not removed.
{code}
[root@ip-1-2-3-4 container_1705470896818_0017_01_02]# ls -R -lrt 
/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_02
/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_02:
total 0
drwxr-xr-x 2 yarn yarn  6 Jan 18 08:34 tmp
drwxr-xr-x 4 yarn yarn 66 Jan 18 08:34 blobStorage
drwxr-xr-x 2 yarn yarn  6 Jan 18 08:34 slotAllocationSnapshots
drwxr-xr-x 2 yarn yarn  6 Jan 18 08:34 localState

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_02/tmp:
total 0

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_02/blobStorage:
total 0
drwxr-xr-x 2 yarn yarn 94 Jan 18 08:34 job_d11f7085314ef1fb04c4e12fe292185a
drwxr-xr-x 2 yarn yarn  6 Jan 18 08:34 incoming

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_02/blobStorage/job_d11f7085314ef1fb04c4e12fe292185a:
total 12
-rw-r--r-- 1 yarn yarn 10323 Jan 18 08:34 
blob_p-cdd441a64b3ea6eed0058df02c6c10fd208c94a8-86d84864273dad1e8084d8ef0f5aad52

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_02/blobStorage/incoming:
total 0

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_02/slotAllocationSnapshots:
total 0

/mnt2/yarn/usercache/hadoop/appcache/application_1705470896818_0017/tm_container_1705470896818_0017_01_02/localState:
total 0


{code}
*Analysis*

1. The TaskManagerRunner removes the working directory only when its 'close' 
method is called, which never happens.
{code:java}
public void close() throws Exception {
    try {
        closeAsync().get();
    } catch (ExecutionException e) {
        ExceptionUtils.rethrowException(ExceptionUtils.stripExecutionException(e));
    }
}

public CompletableFuture<Result> closeAsync() {
    return closeAsync(Result.SUCCESS);
}
{code}
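
One possible direction, sketched below purely for illustration (this is an 
assumption about a fix, not the actual Flink change): register a JVM shutdown 
hook that deletes the working directory even when {{close()}} is never invoked.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class WorkingDirectoryCleanup {

    // Illustrative sketch only: delete the TaskManager working directory on
    // JVM shutdown in case TaskManagerRunner#close() is never called. The
    // directory path would be resolved from the Flink configuration.
    public static void registerCleanupHook(Path workingDirectory) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try (Stream<Path> paths = Files.walk(workingDirectory)) {
                // Delete children before parents.
                paths.sorted(Comparator.reverseOrder())
                        .forEach(WorkingDirectoryCleanup::deleteQuietly);
            } catch (IOException ignored) {
                // Best effort: the JVM is shutting down anyway.
            }
        }));
    }

    private static void deleteQuietly(Path path) {
        try {
            Files.deleteIfExists(path);
        } catch (IOException ignored) {
            // Ignore: best-effort cleanup.
        }
    }
}
{code}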
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34132) Batch WordCount job fails when run with AdaptiveBatch scheduler

2024-01-17 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-34132:
--
Description: 
Batch WordCount job fails when run with the AdaptiveBatch scheduler.

*Repro Steps*
{code}
flink-yarn-session -Djobmanager.scheduler=adaptive -d

 flink run -d /usr/lib/flink/examples/batch/WordCount.jar --input 
s3://prabhuflinks3/INPUT --output s3://prabhuflinks3/OUT
{code}
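(Note: {{-Djobmanager.scheduler=adaptive}} selects the adaptive streaming 
scheduler, which supports streaming jobs only; assuming the AdaptiveBatch 
scheduler named in the title was intended, the flag would presumably be as 
below.)

{code}
flink-yarn-session -Djobmanager.scheduler=AdaptiveBatch -d
{code}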
*Error logs*
{code:java}
 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: java.util.concurrent.ExecutionException: 
java.lang.RuntimeException: 
org.apache.flink.runtime.client.JobInitializationException: Could not start the 
JobMaster.
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:105)
at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:851)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:245)
at 
org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1095)
at 
org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at 
org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at 
org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.RuntimeException: 
org.apache.flink.runtime.client.JobInitializationException: Could not start the 
JobMaster.
at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
at 
org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1067)
at 
org.apache.flink.client.program.ContextEnvironment.executeAsync(ContextEnvironment.java:144)
at 
org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:73)
at 
org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:106)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
... 12 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
org.apache.flink.runtime.client.JobInitializationException: Could not start the 
JobMaster.
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at 
org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1062)
... 20 more
Caused by: java.lang.RuntimeException: 
org.apache.flink.runtime.client.JobInitializationException: Could not start the 
JobMaster.
at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
at 
org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:75)
at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
at 
java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:457)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at 
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Caused by: org.apache.flink.runtime.client.JobInitializationException: Could 
not start the JobMaster.
at 
org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.lambda$new$0(DefaultJobMasterServiceProcess.java:97)
at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
at 

[jira] [Created] (FLINK-34132) Batch WordCount job fails when run with AdaptiveBatch scheduler

2024-01-17 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-34132:
-

 Summary: Batch WordCount job fails when run with AdaptiveBatch 
scheduler
 Key: FLINK-34132
 URL: https://issues.apache.org/jira/browse/FLINK-34132
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.18.1, 1.17.1
Reporter: Prabhu Joseph


Batch WordCount job fails when run with the AdaptiveBatch scheduler.

*Repro Steps*

{code}
flink-yarn-session -Djobmanager.scheduler=adaptive -d

 flink run -d /usr/lib/flink/examples/batch/WordCount.jar --input 
s3://prabhuflinks3/INPUT --output s3://prabhuflinks3/OUT
{code}

*Error logs*

{code}
 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: java.util.concurrent.ExecutionException: 
java.lang.RuntimeException: 
org.apache.flink.runtime.client.JobInitializationException: Could not start the 
JobMaster.
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:105)
at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:851)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:245)
at 
org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1095)
at 
org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at 
org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at 
org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.RuntimeException: 
org.apache.flink.runtime.client.JobInitializationException: Could not start the 
JobMaster.
at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
at 
org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1067)
at 
org.apache.flink.client.program.ContextEnvironment.executeAsync(ContextEnvironment.java:144)
at 
org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:73)
at 
org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:106)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
... 12 more
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
org.apache.flink.runtime.client.JobInitializationException: Could not start the 
JobMaster.
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at 
org.apache.flink.api.java.ExecutionEnvironment.executeAsync(ExecutionEnvironment.java:1062)
... 20 more
Caused by: java.lang.RuntimeException: 
org.apache.flink.runtime.client.JobInitializationException: Could not start the 
JobMaster.
at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
at 
org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:75)
at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
at 
java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:457)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at 
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Caused by: org.apache.flink.runtime.client.JobInitializationException: Could 
not start the JobMaster.
at 
org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.lambda$new$0(DefaultJobMasterServiceProcess.java:97)
at 

[jira] [Commented] (FLINK-33957) Generic Log-Based Incremental Checkpoint fails when Task Local Recovery is enabled

2023-12-28 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17801082#comment-17801082
 ] 

Prabhu Joseph commented on FLINK-33957:
---

[~Yanfei Lei] Could you take a look at this? Any idea when this issue could 
happen? Thanks.

> Generic Log-Based Incremental Checkpoint fails when Task Local Recovery is 
> enabled
> --
>
> Key: FLINK-33957
> URL: https://issues.apache.org/jira/browse/FLINK-33957
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.17.1
>Reporter: Prabhu Joseph
>Priority: Major
>
> Generic Log-Based Incremental Checkpoint fails when Task Local Recovery is 
> enabled. The issue happened to one of our users. I am trying to reproduce the 
> issue.
> {code}
> at 
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.handleExecutionException(AsyncCheckpointRunnable.java:298)
>  ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   at 
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:155)
>  ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   ... 3 more
> Caused by: org.apache.flink.util.SerializedThrowable: 
> java.lang.IllegalStateException: one checkpoint contains at most one 
> materializationID
>   at 
> org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) 
> ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   at 
> org.apache.flink.runtime.state.ChangelogTaskLocalStateStore.updateReference(ChangelogTaskLocalStateStore.java:92)
>  ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   at 
> org.apache.flink.runtime.state.ChangelogTaskLocalStateStore.storeLocalState(ChangelogTaskLocalStateStore.java:130)
>  ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   at 
> org.apache.flink.runtime.state.TaskStateManagerImpl.reportTaskStateSnapshots(TaskStateManagerImpl.java:140)
>  ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   at 
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.reportCompletedSnapshotStates(AsyncCheckpointRunnable.java:237)
>  ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   at 
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:136)
>  ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   ... 3 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33957) Generic Log-Based Incremental Checkpoint fails when Task Local Recovery is enabled

2023-12-28 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33957:
--
Description: 
Generic Log-Based Incremental Checkpoint fails when Task Local Recovery is 
enabled. The issue happened to one of our users. I am trying to reproduce the 
issue.

{code}
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.handleExecutionException(AsyncCheckpointRunnable.java:298)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:155)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
... 3 more
Caused by: org.apache.flink.util.SerializedThrowable: 
java.lang.IllegalStateException: one checkpoint contains at most one 
materializationID
at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) 
~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.runtime.state.ChangelogTaskLocalStateStore.updateReference(ChangelogTaskLocalStateStore.java:92)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.runtime.state.ChangelogTaskLocalStateStore.storeLocalState(ChangelogTaskLocalStateStore.java:130)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.runtime.state.TaskStateManagerImpl.reportTaskStateSnapshots(TaskStateManagerImpl.java:140)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.reportCompletedSnapshotStates(AsyncCheckpointRunnable.java:237)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:136)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
... 3 more
{code}

  was:
Generic Log-Based Incremental Checkpoint fails when Task Local Recovery is 
enabled. The issue happened to one of our users. I am trying to reproduce the 
issue.

```
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.handleExecutionException(AsyncCheckpointRunnable.java:298)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:155)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
... 3 more
Caused by: org.apache.flink.util.SerializedThrowable: 
java.lang.IllegalStateException: one checkpoint contains at most one 
materializationID
at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) 
~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.runtime.state.ChangelogTaskLocalStateStore.updateReference(ChangelogTaskLocalStateStore.java:92)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.runtime.state.ChangelogTaskLocalStateStore.storeLocalState(ChangelogTaskLocalStateStore.java:130)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.runtime.state.TaskStateManagerImpl.reportTaskStateSnapshots(TaskStateManagerImpl.java:140)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.reportCompletedSnapshotStates(AsyncCheckpointRunnable.java:237)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:136)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
... 3 more
```


> Generic Log-Based Incremental Checkpoint fails when Task Local Recovery is 
> enabled
> --
>
> Key: FLINK-33957
> URL: https://issues.apache.org/jira/browse/FLINK-33957
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.17.1
>Reporter: Prabhu Joseph
>Priority: Major
>
> Generic Log-Based Incremental Checkpoint fails when Task Local Recovery is 
> enabled. The issue happened to one of our users. I am trying to reproduce the 
> issue.
> {code}
> at 
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.handleExecutionException(AsyncCheckpointRunnable.java:298)
>  ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   at 
> org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:155)
>  ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   ... 3 more
> Caused by: org.apache.flink.util.SerializedThrowable: 
> java.lang.IllegalStateException: one checkpoint contains at most one 
> materializationID
>   at 
> org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) 
> ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
>   at 
> org.apache.flink.runtime.state.ChangelogTaskLocalStateStore.updateReference(ChangelogTaskLocalStateStore.java:92)
>  

[jira] [Created] (FLINK-33957) Generic Log-Based Incremental Checkpoint fails when Task Local Recovery is enabled

2023-12-28 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-33957:
-

 Summary: Generic Log-Based Incremental Checkpoint fails when Task 
Local Recovery is enabled
 Key: FLINK-33957
 URL: https://issues.apache.org/jira/browse/FLINK-33957
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.17.1
Reporter: Prabhu Joseph


Generic Log-Based Incremental Checkpoint fails when Task Local Recovery is 
enabled. The issue happened to one of our users. I am trying to reproduce the 
issue.

```
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.handleExecutionException(AsyncCheckpointRunnable.java:298)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:155)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
... 3 more
Caused by: org.apache.flink.util.SerializedThrowable: 
java.lang.IllegalStateException: one checkpoint contains at most one 
materializationID
at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) 
~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.runtime.state.ChangelogTaskLocalStateStore.updateReference(ChangelogTaskLocalStateStore.java:92)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.runtime.state.ChangelogTaskLocalStateStore.storeLocalState(ChangelogTaskLocalStateStore.java:130)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.runtime.state.TaskStateManagerImpl.reportTaskStateSnapshots(TaskStateManagerImpl.java:140)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.reportCompletedSnapshotStates(AsyncCheckpointRunnable.java:237)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
at 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:136)
 ~[flink-dist-1.17.1-amzn-1.jar:1.17.1-amzn-1]
... 3 more
```
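
Since the repro is still being worked out, the failing feature combination can 
presumably be enabled with a configuration along these lines (a sketch using 
the documented option keys; the changelog storage path is a placeholder):

{code}
# Generic log-based (changelog) incremental checkpoints
state.backend.changelog.enabled: true
state.backend.changelog.storage: filesystem
dstl.dfs.base-path: s3://bucket/changelog

# Task-local recovery
state.backend.local-recovery: true
{code}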



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33862) Flink Unit Test Failures on 1.18.0

2023-12-18 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798251#comment-17798251
 ] 

Prabhu Joseph commented on FLINK-33862:
---

Yes, right. Thanks [~martijnvisser].

> Flink Unit Test Failures on 1.18.0
> --
>
> Key: FLINK-33862
> URL: https://issues.apache.org/jira/browse/FLINK-33862
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Flink Unit Test Failures on 1.18.0. There are 100+ unit test cases failing 
> due to the common issues below.
> *Issue 1*
> {code:java}
> ./mvnw -DfailIfNoTests=false -Dmaven.test.failure.ignore=true 
> -Dtest=ExecutionPlanAfterExecutionTest test
> [INFO] Running org.apache.flink.client.program.ExecutionPlanAfterExecutionTest
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
>   at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141)
>   at 
> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642)
>   at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>   at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
>   at 
> org.apache.flink.runtime.rpc.pekko.PekkoInvocationHandler.lambda$invokeRpc$1(PekkoInvocationHandler.java:268)
>   at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
>   at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
>   at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>   at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
>   at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1267)
>   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$1(ClassLoadingUtils.java:93)
>   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
>   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
>   at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
>   at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
>   at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>   at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
>   at 
> org.apache.flink.runtime.concurrent.pekko.ScalaFutureUtils$1.onComplete(ScalaFutureUtils.java:47)
>   at org.apache.pekko.dispatch.OnComplete.internal(Future.scala:310)
>   at org.apache.pekko.dispatch.OnComplete.internal(Future.scala:307)
>   at org.apache.pekko.dispatch.japi$CallbackBridge.apply(Future.scala:234)
>   at org.apache.pekko.dispatch.japi$CallbackBridge.apply(Future.scala:231)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
>   at 
> org.apache.flink.runtime.concurrent.pekko.ScalaFutureUtils$DirectExecutionContext.execute(ScalaFutureUtils.java:65)
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:72)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:288)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:288)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:288)
>   at org.apache.pekko.pattern.PromiseActorRef.$bang(AskSupport.scala:629)
>   at 
> org.apache.pekko.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:34)
>   at 
> org.apache.pekko.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:33)
>   at scala.concurrent.Future.$anonfun$andThen$1(Future.scala:536)
>   at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
>   at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
>   at 
> org.apache.pekko.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:73)
>   at 
> org.apache.pekko.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:110)
>   at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> 

[jira] [Commented] (FLINK-33862) Flink Unit Test Failures on 1.18.0

2023-12-18 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798143#comment-17798143
 ] 

Prabhu Joseph commented on FLINK-33862:
---

[FLINK-28203|https://issues.apache.org/jira/browse/FLINK-28203] marked bundled 
dependencies as optional. Setting the flag flink.markBundledAsOptional to false 
fixed the unit test failures.
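
For reference, a minimal sketch of that override applied to one of the failing 
tests (the property name comes from the comment above; the rest of the command 
mirrors the repro in the issue description):

{code}
./mvnw -Dflink.markBundledAsOptional=false -DfailIfNoTests=false \
    -Dmaven.test.failure.ignore=true -Dtest=ExecutionPlanAfterExecutionTest test
{code}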

> Flink Unit Test Failures on 1.18.0
> --
>
> Key: FLINK-33862
> URL: https://issues.apache.org/jira/browse/FLINK-33862
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Flink Unit Test Failures on 1.18.0. There are 100+ unit test cases failing 
> due to the common issues below.
> *Issue 1*
> {code:java}
> ./mvnw -DfailIfNoTests=false -Dmaven.test.failure.ignore=true 
> -Dtest=ExecutionPlanAfterExecutionTest test
> [INFO] Running org.apache.flink.client.program.ExecutionPlanAfterExecutionTest
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
>   at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141)
>   at 
> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642)
>   at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>   at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
>   at 
> org.apache.flink.runtime.rpc.pekko.PekkoInvocationHandler.lambda$invokeRpc$1(PekkoInvocationHandler.java:268)
>   at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
>   at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
>   at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>   at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
>   at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1267)
>   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$1(ClassLoadingUtils.java:93)
>   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
>   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
>   at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
>   at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
>   at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>   at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
>   at 
> org.apache.flink.runtime.concurrent.pekko.ScalaFutureUtils$1.onComplete(ScalaFutureUtils.java:47)
>   at org.apache.pekko.dispatch.OnComplete.internal(Future.scala:310)
>   at org.apache.pekko.dispatch.OnComplete.internal(Future.scala:307)
>   at org.apache.pekko.dispatch.japi$CallbackBridge.apply(Future.scala:234)
>   at org.apache.pekko.dispatch.japi$CallbackBridge.apply(Future.scala:231)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
>   at 
> org.apache.flink.runtime.concurrent.pekko.ScalaFutureUtils$DirectExecutionContext.execute(ScalaFutureUtils.java:65)
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:72)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:288)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:288)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:288)
>   at org.apache.pekko.pattern.PromiseActorRef.$bang(AskSupport.scala:629)
>   at 
> org.apache.pekko.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:34)
>   at 
> org.apache.pekko.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:33)
>   at scala.concurrent.Future.$anonfun$andThen$1(Future.scala:536)
>   at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
>   at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
>   at 
> org.apache.pekko.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:73)
>   at 
> 

[jira] [Updated] (FLINK-33862) Flink Unit Test Failures on 1.18.0

2023-12-15 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33862:
--
Affects Version/s: 1.19.0

> Flink Unit Test Failures on 1.18.0
> --
>
> Key: FLINK-33862
> URL: https://issues.apache.org/jira/browse/FLINK-33862
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Flink Unit Test Failures on 1.18.0. There are 100+ unit test cases failing 
> due to the common issues below.
> *Issue 1*
> {code:java}
> ./mvnw -DfailIfNoTests=false -Dmaven.test.failure.ignore=true 
> -Dtest=ExecutionPlanAfterExecutionTest test
> [INFO] Running org.apache.flink.client.program.ExecutionPlanAfterExecutionTest
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
>   at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141)
>   at 
> java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642)
>   at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>   at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
>   at 
> org.apache.flink.runtime.rpc.pekko.PekkoInvocationHandler.lambda$invokeRpc$1(PekkoInvocationHandler.java:268)
>   at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
>   at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
>   at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>   at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
>   at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1267)
>   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$1(ClassLoadingUtils.java:93)
>   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
>   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
>   at 
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
>   at 
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
>   at 
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
>   at 
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
>   at 
> org.apache.flink.runtime.concurrent.pekko.ScalaFutureUtils$1.onComplete(ScalaFutureUtils.java:47)
>   at org.apache.pekko.dispatch.OnComplete.internal(Future.scala:310)
>   at org.apache.pekko.dispatch.OnComplete.internal(Future.scala:307)
>   at org.apache.pekko.dispatch.japi$CallbackBridge.apply(Future.scala:234)
>   at org.apache.pekko.dispatch.japi$CallbackBridge.apply(Future.scala:231)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
>   at 
> org.apache.flink.runtime.concurrent.pekko.ScalaFutureUtils$DirectExecutionContext.execute(ScalaFutureUtils.java:65)
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:72)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:288)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:288)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:288)
>   at org.apache.pekko.pattern.PromiseActorRef.$bang(AskSupport.scala:629)
>   at 
> org.apache.pekko.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:34)
>   at 
> org.apache.pekko.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:33)
>   at scala.concurrent.Future.$anonfun$andThen$1(Future.scala:536)
>   at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
>   at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
>   at 
> org.apache.pekko.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:73)
>   at 
> org.apache.pekko.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:110)
>   at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>   at 
> 

[jira] [Created] (FLINK-33862) Flink Unit Test Failures on 1.18.0

2023-12-15 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-33862:
-

 Summary: Flink Unit Test Failures on 1.18.0
 Key: FLINK-33862
 URL: https://issues.apache.org/jira/browse/FLINK-33862
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.18.0
Reporter: Prabhu Joseph


Flink Unit Test Failures on 1.18.0. There are 100+ unit test cases failing due 
to the common issues below.

*Issue 1*
{code:java}
./mvnw -DfailIfNoTests=false -Dmaven.test.failure.ignore=true 
-Dtest=ExecutionPlanAfterExecutionTest test

[INFO] Running org.apache.flink.client.program.ExecutionPlanAfterExecutionTest
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
at 
org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141)
at 
java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642)
at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at 
java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
at 
org.apache.flink.runtime.rpc.pekko.PekkoInvocationHandler.lambda$invokeRpc$1(PekkoInvocationHandler.java:268)
at 
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
at 
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at 
java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
at 
org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1267)
at 
org.apache.flink.runtime.concurrent.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$1(ClassLoadingUtils.java:93)
at 
org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
at 
org.apache.flink.runtime.concurrent.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
at 
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
at 
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
at 
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at 
java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
at 
org.apache.flink.runtime.concurrent.pekko.ScalaFutureUtils$1.onComplete(ScalaFutureUtils.java:47)
at org.apache.pekko.dispatch.OnComplete.internal(Future.scala:310)
at org.apache.pekko.dispatch.OnComplete.internal(Future.scala:307)
at org.apache.pekko.dispatch.japi$CallbackBridge.apply(Future.scala:234)
at org.apache.pekko.dispatch.japi$CallbackBridge.apply(Future.scala:231)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at 
org.apache.flink.runtime.concurrent.pekko.ScalaFutureUtils$DirectExecutionContext.execute(ScalaFutureUtils.java:65)
at 
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:72)
at 
scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:288)
at 
scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:288)
at 
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:288)
at org.apache.pekko.pattern.PromiseActorRef.$bang(AskSupport.scala:629)
at 
org.apache.pekko.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:34)
at 
org.apache.pekko.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:33)
at scala.concurrent.Future.$anonfun$andThen$1(Future.scala:536)
at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at 
org.apache.pekko.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:73)
at 
org.apache.pekko.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:110)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at 
scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)
at 
org.apache.pekko.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:110)
at 
org.apache.pekko.dispatch.TaskInvocation.run(AbstractDispatcher.scala:59)
at 

[jira] [Closed] (FLINK-33753) ContinuousFileReaderOperator consume records as mini batch

2023-12-06 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph closed FLINK-33753.
-
Resolution: Not A Problem

> ContinuousFileReaderOperator consume records as mini batch
> --
>
> Key: FLINK-33753
> URL: https://issues.apache.org/jira/browse/FLINK-33753
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.16.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> The ContinuousFileReaderOperator reads and collects the records from a split 
> in a loop. If the split is large, the loop takes longer, and the mailbox 
> executor won't have a chance to process the checkpoint barrier. This leads to 
> checkpoint timeouts. ContinuousFileReaderOperator could be improved to 
> consume the records in mini batches, similar to Hudi's StreamReadOperator 
> (https://issues.apache.org/jira/browse/HUDI-2485).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33753) ContinuousFileReaderOperator consume records as mini batch

2023-12-06 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33753:
--
Affects Version/s: 1.16.0
   (was: 1.18.0)

> ContinuousFileReaderOperator consume records as mini batch
> --
>
> Key: FLINK-33753
> URL: https://issues.apache.org/jira/browse/FLINK-33753
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.16.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> The ContinuousFileReaderOperator reads and collects the records from a split 
> in a loop. If the split is large, the loop takes longer, and the mailbox 
> executor won't have a chance to process the checkpoint barrier. This leads to 
> checkpoint timeouts. ContinuousFileReaderOperator could be improved to 
> consume the records in mini batches, similar to Hudi's StreamReadOperator 
> (https://issues.apache.org/jira/browse/HUDI-2485).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33753) ContinuousFileReaderOperator consume records as mini batch

2023-12-05 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-33753:
-

 Summary: ContinuousFileReaderOperator consume records as mini batch
 Key: FLINK-33753
 URL: https://issues.apache.org/jira/browse/FLINK-33753
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.18.0
Reporter: Prabhu Joseph


The ContinuousFileReaderOperator reads and collects the records from a split in 
a loop. If the split is large, the loop takes longer, and the mailbox executor 
won't have a chance to process the checkpoint barrier. This leads to checkpoint 
timeouts. ContinuousFileReaderOperator could be improved to consume the records 
in mini batches, similar to Hudi's StreamReadOperator 
(https://issues.apache.org/jira/browse/HUDI-2485).
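
A self-contained sketch of the mini-batch idea (illustrative only; the names 
and the single-threaded executor standing in for Flink's mailbox are 
simplifying assumptions, not the actual operator code):

{code:java}
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MiniBatchReaderSketch {

    private static final int BATCH_SIZE = 1_000;

    // Stand-in for the mailbox executor: a single-threaded queue of "mails".
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    public void consume(Iterator<String> split) {
        mailbox.execute(() -> processBatch(split));
    }

    private void processBatch(Iterator<String> split) {
        int count = 0;
        // Read at most BATCH_SIZE records per "mail" instead of draining
        // the whole split in one tight loop.
        while (count < BATCH_SIZE && split.hasNext()) {
            emit(split.next());
            count++;
        }
        if (split.hasNext()) {
            // Yield: re-enqueue the continuation so queued mails (e.g., a
            // checkpoint barrier) get processed before the next batch.
            mailbox.execute(() -> processBatch(split));
        }
    }

    private void emit(String record) {
        // Downstream collection would happen here.
    }

    public static void main(String[] args) {
        MiniBatchReaderSketch sketch = new MiniBatchReaderSketch();
        sketch.consume(List.of("a", "b", "c").iterator());
        sketch.mailbox.shutdown();
    }
}
{code}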



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33547) SQL primitive array type after upgrading to Flink 1.18.0

2023-11-23 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789317#comment-17789317
 ] 

Prabhu Joseph commented on FLINK-33547:
---

[~xccui]  Thanks for the details. This issue seems to be a duplicate of 
[FLINK-33523|https://issues.apache.org/jira/browse/FLINK-33523]. Shall we track 
this issue in FLINK-33523? Thanks.

> SQL primitive array type after upgrading to Flink 1.18.0
> 
>
> Key: FLINK-33547
> URL: https://issues.apache.org/jira/browse/FLINK-33547
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Table SQL / Runtime
>Affects Versions: 1.18.0
>Reporter: Xingcan Cui
>Priority: Major
>
> We have some Flink SQL UDFs that use object array (Object[]) arguments and 
> take boxed arrays (e.g., Float[]) as parameters. After upgrading to Flink 
> 1.18.0, the data created by ARRAY[] SQL function became primitive arrays 
> (e.g., float[]) and it caused argument mismatch issues. I'm not sure if it's 
> expected.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33523) DataType ARRAY fails to cast into Object[]

2023-11-23 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789316#comment-17789316
 ] 

Prabhu Joseph commented on FLINK-33523:
---

Thanks [~jeyhun]. Flink 1.17 does not have this enforcement. Flink 1.18 added 
it in [FLINK-31835|https://issues.apache.org/jira/browse/FLINK-31835], which 
has broken multiple places, such as 
[FLINK-33547|https://issues.apache.org/jira/browse/FLINK-33547] and the 
[Iceberg test cases|https://github.com/apache/iceberg/issues/8930].

> DataType ARRAY fails to cast into Object[]
> 
>
> Key: FLINK-33523
> URL: https://issues.apache.org/jira/browse/FLINK-33523
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.18.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> When upgrading Iceberg's Flink version to 1.18, we found a Flink-related 
> unit test case broken by this issue. The code below used to work fine in 
> Flink 1.17 but fails after upgrading to 1.18. DataType ARRAY<INT NOT NULL> 
> fails to cast into Object[].
> *Error:*
> {code}
> Exception in thread "main" java.lang.ClassCastException: [I cannot be cast to 
> [Ljava.lang.Object;
> at FlinkArrayIntNotNullTest.main(FlinkArrayIntNotNullTest.java:18)
> {code}
> *Repro:*
> {code}
>   import org.apache.flink.table.data.ArrayData;
>   import org.apache.flink.table.data.GenericArrayData;
>   import org.apache.flink.table.api.EnvironmentSettings;
>   import org.apache.flink.table.api.TableEnvironment;
>   import org.apache.flink.table.api.TableResult;
>   public class FlinkArrayIntNotNullTest {
> public static void main(String[] args) throws Exception {
>   EnvironmentSettings settings = 
> EnvironmentSettings.newInstance().inBatchMode().build();
>   TableEnvironment env = TableEnvironment.create(settings);
>   env.executeSql("CREATE TABLE filesystemtable2 (id INT, data ARRAY<INT NOT NULL>) WITH ('connector' = 'filesystem', 'path' = 
> '/tmp/FLINK/filesystemtable2', 'format'='json')");
>   env.executeSql("INSERT INTO filesystemtable2 VALUES (4,ARRAY [1,2,3])");
>   TableResult tableResult = env.executeSql("SELECT * from 
> filesystemtable2");
>   ArrayData actualArrayData = new GenericArrayData((Object[]) 
> tableResult.collect().next().getField(1));
> }
>   }
> {code}
> *Analysis:*
> 1. The code works fine with the ARRAY<INT> datatype. The issue happens when 
> using ARRAY<INT NOT NULL>.
> 2. The code works fine when casting into int[] instead of Object[].
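
A caller-side workaround sketch (an illustration continuing the repro above, 
not an official fix): check for a primitive array and box it before the cast.

{code:java}
// Hypothetical workaround: box a primitive int[] before treating the
// field as Object[]. Continues the repro above; illustrative only.
Object field = tableResult.collect().next().getField(1);
Object[] boxed;
if (field instanceof int[]) {
    int[] primitive = (int[]) field;
    boxed = new Object[primitive.length];
    for (int i = 0; i < primitive.length; i++) {
        boxed[i] = primitive[i];
    }
} else {
    boxed = (Object[]) field;
}
ArrayData actualArrayData = new GenericArrayData(boxed);
{code}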



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33536) Flink Table API CSV streaming sink fails with "IOException: Stream closed"

2023-11-13 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785702#comment-17785702
 ] 

Prabhu Joseph commented on FLINK-33536:
---

Thanks [~samrat007].

> Flink Table API CSV streaming sink fails with "IOException: Stream closed"
> --
>
> Key: FLINK-33536
> URL: https://issues.apache.org/jira/browse/FLINK-33536
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystems, Table SQL / API
>Affects Versions: 1.18.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Flink Table API CSV streaming sink fails with "IOException: Stream closed". 
> Prior to Flink 1.18, the CSV streaming sink used to fail with 
> "S3RecoverableFsDataOutputStream cannot sync state to S3", which was fixed by 
> [FLINK-28513|https://issues.apache.org/jira/browse/FLINK-28513]. The fix 
> seems incomplete; it now fails with this issue.
> *Repro*
> {code}
> SET 'execution.runtime-mode' = 'streaming';
> create table dummy_table (
>   id int,
>   data string
> ) with (
>   'connector' = 'filesystem',
>   'path' = 's3://prabhuflinks3/dummy_table/',
>   'format' = 'csv'
> );
> INSERT INTO dummy_table SELECT * FROM (VALUES (1, 'Hello World'), (2, 'Hi'), 
> (2, 'Hi'), (3, 'Hello'), (3, 'World'), (4, 'ADD'), (5, 'LINE'));
> {code}
> *Error*
> {code}
> Caused by: java.io.IOException: Stream closed.
>   at 
> org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:76)
>   at 
> org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52)
>   at 
> org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104)
>   at 
> org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeAndUploadPart(S3RecoverableFsDataOutputStream.java:209)
>   at 
> org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeForCommit(S3RecoverableFsDataOutputStream.java:177)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter.closeForCommit(OutputStreamBasedPartFileWriter.java:75)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:65)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.closePartFile(Bucket.java:263)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.onProcessingTime(Bucket.java:379)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onProcessingTime(Buckets.java:338)
>   at 
> org.apache.flink.connector.file.table.stream.AbstractStreamingWriter.endInput(AbstractStreamingWriter.java:155)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.endOperatorInput(StreamOperatorWrapper.java:96)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.lambda$finish$0(StreamOperatorWrapper.java:149)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:149)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:156)
>   at 
> org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.finishOperators(RegularOperatorChain.java:115)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.endData(StreamTask.java:619)
>   at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.lambda$completeProcessing$0(SourceStreamTask.java:367)
>   at 
> org.apache.flink.util.function.FunctionUtils.lambda$asCallable$5(FunctionUtils.java:126)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMail(MailboxProcessor.java:398)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:367)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:352)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:229)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858)
>   at 
> 

[jira] [Commented] (FLINK-33536) Flink Table API CSV streaming sink fails with "IOException: Stream closed"

2023-11-13 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785573#comment-17785573
 ] 

Prabhu Joseph commented on FLINK-33536:
---

We are using flink-s3-fs-hadoop-1.18.0.jar.

> Flink Table API CSV streaming sink fails with "IOException: Stream closed"
> --
>
> Key: FLINK-33536
> URL: https://issues.apache.org/jira/browse/FLINK-33536
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystems, Table SQL / API
>Affects Versions: 1.18.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Flink Table API CSV streaming sink fails with "IOException: Stream closed". 
> Prior to Flink 1.18, the CSV streaming sink used to fail with 
> "S3RecoverableFsDataOutputStream cannot sync state to S3", which was fixed by 
> [FLINK-28513|https://issues.apache.org/jira/browse/FLINK-28513]. The fix 
> seems incomplete; it now fails with this issue.
> *Repro*
> {code}
> SET 'execution.runtime-mode' = 'streaming';
> create table dummy_table (
>   id int,
>   data string
> ) with (
>   'connector' = 'filesystem',
>   'path' = 's3://prabhuflinks3/dummy_table/',
>   'format' = 'csv'
> );
> INSERT INTO dummy_table SELECT * FROM (VALUES (1, 'Hello World'), (2, 'Hi'), 
> (2, 'Hi'), (3, 'Hello'), (3, 'World'), (4, 'ADD'), (5, 'LINE'));
> {code}
> *Error*
> {code}
> Caused by: java.io.IOException: Stream closed.
>   at 
> org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:76)
>   at 
> org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52)
>   at 
> org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104)
>   at 
> org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeAndUploadPart(S3RecoverableFsDataOutputStream.java:209)
>   at 
> org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeForCommit(S3RecoverableFsDataOutputStream.java:177)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter.closeForCommit(OutputStreamBasedPartFileWriter.java:75)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:65)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.closePartFile(Bucket.java:263)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.onProcessingTime(Bucket.java:379)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onProcessingTime(Buckets.java:338)
>   at 
> org.apache.flink.connector.file.table.stream.AbstractStreamingWriter.endInput(AbstractStreamingWriter.java:155)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.endOperatorInput(StreamOperatorWrapper.java:96)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.lambda$finish$0(StreamOperatorWrapper.java:149)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:149)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:156)
>   at 
> org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.finishOperators(RegularOperatorChain.java:115)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.endData(StreamTask.java:619)
>   at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.lambda$completeProcessing$0(SourceStreamTask.java:367)
>   at 
> org.apache.flink.util.function.FunctionUtils.lambda$asCallable$5(FunctionUtils.java:126)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMail(MailboxProcessor.java:398)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:367)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:352)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:229)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858)

[jira] [Updated] (FLINK-33536) Flink Table API CSV streaming sink fails with "IOException: Stream closed"

2023-11-13 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33536:
--
Summary: Flink Table API CSV streaming sink fails with "IOException: Stream 
closed"  (was: Flink Table API CSV streaming sink throws "IOException: Stream 
closed")

> Flink Table API CSV streaming sink fails with "IOException: Stream closed"
> --
>
> Key: FLINK-33536
> URL: https://issues.apache.org/jira/browse/FLINK-33536
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystems, Table SQL / API
>Affects Versions: 1.18.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Flink Table API CSV streaming sink throws "IOException: Stream closed". Prior 
> to Flink 1.18, the CSV streaming sink used to fail with 
> "S3RecoverableFsDataOutputStream cannot sync state to S3", which was fixed by 
> [FLINK-28513|https://issues.apache.org/jira/browse/FLINK-28513]. That fix 
> appears incomplete; the sink now fails with this new error.
> *Repro*
> {code}
> SET 'execution.runtime-mode' = 'streaming';
> create table dummy_table (
>   id int,
>   data string
> ) with (
>   'connector' = 'filesystem',
>   'path' = 's3://prabhuflinks3/dummy_table/',
>   'format' = 'csv'
> );
> INSERT INTO dummy_table SELECT * FROM (VALUES (1, 'Hello World'), (2, 'Hi'), 
> (2, 'Hi'), (3, 'Hello'), (3, 'World'), (4, 'ADD'), (5, 'LINE'));
> {code}
> *Error*
> {code}
> Caused by: java.io.IOException: Stream closed.
>   at 
> org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:76)
>   at 
> org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52)
>   at 
> org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104)
>   at 
> org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeAndUploadPart(S3RecoverableFsDataOutputStream.java:209)
>   at 
> org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeForCommit(S3RecoverableFsDataOutputStream.java:177)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter.closeForCommit(OutputStreamBasedPartFileWriter.java:75)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:65)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.closePartFile(Bucket.java:263)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.onProcessingTime(Bucket.java:379)
>   at 
> org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onProcessingTime(Buckets.java:338)
>   at 
> org.apache.flink.connector.file.table.stream.AbstractStreamingWriter.endInput(AbstractStreamingWriter.java:155)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.endOperatorInput(StreamOperatorWrapper.java:96)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.lambda$finish$0(StreamOperatorWrapper.java:149)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:149)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:156)
>   at 
> org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.finishOperators(RegularOperatorChain.java:115)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.endData(StreamTask.java:619)
>   at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.lambda$completeProcessing$0(SourceStreamTask.java:367)
>   at 
> org.apache.flink.util.function.FunctionUtils.lambda$asCallable$5(FunctionUtils.java:126)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMail(MailboxProcessor.java:398)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:367)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:352)
>   at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:229)

[jira] [Updated] (FLINK-33536) Flink Table API CSV streaming sink fails with "IOException: Stream closed"

2023-11-13 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33536:
--
Description: 
Flink Table API CSV streaming sink fails with "IOException: Stream closed". 
Prior to Flink 1.18, the CSV streaming sink used to fail with 
"S3RecoverableFsDataOutputStream cannot sync state to S3", which was fixed by 
[FLINK-28513|https://issues.apache.org/jira/browse/FLINK-28513]. That fix 
appears incomplete; the sink now fails with this new error.

*Repro*

{code}
SET 'execution.runtime-mode' = 'streaming';

create table dummy_table (
  id int,
  data string
) with (
  'connector' = 'filesystem',
  'path' = 's3://prabhuflinks3/dummy_table/',
  'format' = 'csv'
);

INSERT INTO dummy_table SELECT * FROM (VALUES (1, 'Hello World'), (2, 'Hi'), 
(2, 'Hi'), (3, 'Hello'), (3, 'World'), (4, 'ADD'), (5, 'LINE'));

{code}


*Error*

{code}
Caused by: java.io.IOException: Stream closed.
at 
org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:76)
at 
org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52)
at 
org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104)
at 
org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeAndUploadPart(S3RecoverableFsDataOutputStream.java:209)
at 
org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeForCommit(S3RecoverableFsDataOutputStream.java:177)
at 
org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter.closeForCommit(OutputStreamBasedPartFileWriter.java:75)
at 
org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:65)
at 
org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.closePartFile(Bucket.java:263)
at 
org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.onProcessingTime(Bucket.java:379)
at 
org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onProcessingTime(Buckets.java:338)
at 
org.apache.flink.connector.file.table.stream.AbstractStreamingWriter.endInput(AbstractStreamingWriter.java:155)
at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.endOperatorInput(StreamOperatorWrapper.java:96)
at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.lambda$finish$0(StreamOperatorWrapper.java:149)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:149)
at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:156)
at 
org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.finishOperators(RegularOperatorChain.java:115)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.endData(StreamTask.java:619)
at 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.lambda$completeProcessing$0(SourceStreamTask.java:367)
at 
org.apache.flink.util.function.FunctionUtils.lambda$asCallable$5(FunctionUtils.java:126)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMail(MailboxProcessor.java:398)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:367)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:352)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:229)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:807)
at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
at 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:932)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
at java.lang.Thread.run(Thread.java:750)
{code}
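
The stack trace shows closeForCommit() flushing a RefCountedBufferingFileStream whose backing RefCountedFileWithStream is already closed. As a minimal sketch of the kind of write-after-close guard the trace points at (illustrative classes and fields only, not the actual Flink code and not the committed fix):

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Illustrative sketch: a buffering stream whose flush() becomes a no-op after
// close(), so a later closeForCommit-style call cannot trigger
// "IOException: Stream closed" on the underlying file stream.
final class GuardedBufferingStream extends OutputStream {
    private final OutputStream out;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean closed;

    GuardedBufferingStream(OutputStream out) {
        this.out = out;
    }

    @Override
    public void write(int b) throws IOException {
        if (closed) {
            throw new IOException("Stream closed.");
        }
        buffer.write(b);
    }

    @Override
    public void flush() throws IOException {
        if (closed) {
            return; // guard: ignore flush after close instead of failing
        }
        buffer.writeTo(out);
        buffer.reset();
        out.flush();
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // idempotent close
        }
        flush();
        closed = true;
        out.close();
    }
}
{code}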

  was:
Flink Table API CSV streaming sink throws "IOException: Stream closed". Prior 
to Flink 1.18, CSV streaming sink used to fail with 
"S3RecoverableFsDataOutputStream cannot sync state to S3", which was fixed by 
[FLINK-28513|https://issues.apache.org/jira/browse/FLINK-28513].

[jira] [Created] (FLINK-33536) Flink Table API CSV streaming sink throws "IOException: Stream closed"

2023-11-13 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-33536:
-

 Summary: Flink Table API CSV streaming sink throws "IOException: 
Stream closed"
 Key: FLINK-33536
 URL: https://issues.apache.org/jira/browse/FLINK-33536
 Project: Flink
  Issue Type: Bug
  Components: FileSystems, Table SQL / API
Affects Versions: 1.18.0
Reporter: Prabhu Joseph


Flink Table API CSV streaming sink throws "IOException: Stream closed". Prior 
to Flink 1.18, the CSV streaming sink used to fail with 
"S3RecoverableFsDataOutputStream cannot sync state to S3", which was fixed by 
[FLINK-28513|https://issues.apache.org/jira/browse/FLINK-28513]. That fix 
appears incomplete; the sink now fails with this new error.

*Repro*

{code}
SET 'execution.runtime-mode' = 'streaming';

create table dummy_table (
  id int,
  data string
) with (
  'connector' = 'filesystem',
  'path' = 's3://prabhuflinks3/dummy_table/',
  'format' = 'csv'
);

INSERT INTO dummy_table SELECT * FROM (VALUES (1, 'Hello World'), (2, 'Hi'), 
(2, 'Hi'), (3, 'Hello'), (3, 'World'), (4, 'ADD'), (5, 'LINE'));

{code}


*Error*

{code}
Caused by: java.io.IOException: Stream closed.
at 
org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:76)
at 
org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52)
at 
org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104)
at 
org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeAndUploadPart(S3RecoverableFsDataOutputStream.java:209)
at 
org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeForCommit(S3RecoverableFsDataOutputStream.java:177)
at 
org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter.closeForCommit(OutputStreamBasedPartFileWriter.java:75)
at 
org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:65)
at 
org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.closePartFile(Bucket.java:263)
at 
org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.onProcessingTime(Bucket.java:379)
at 
org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onProcessingTime(Buckets.java:338)
at 
org.apache.flink.connector.file.table.stream.AbstractStreamingWriter.endInput(AbstractStreamingWriter.java:155)
at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.endOperatorInput(StreamOperatorWrapper.java:96)
at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.lambda$finish$0(StreamOperatorWrapper.java:149)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:149)
at 
org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.finish(StreamOperatorWrapper.java:156)
at 
org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.finishOperators(RegularOperatorChain.java:115)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.endData(StreamTask.java:619)
at 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.lambda$completeProcessing$0(SourceStreamTask.java:367)
at 
org.apache.flink.util.function.FunctionUtils.lambda$asCallable$5(FunctionUtils.java:126)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:93)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMail(MailboxProcessor.java:398)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:367)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:352)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:229)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:807)
at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
at 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:932)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)

[jira] [Updated] (FLINK-33529) PyFlink fails with "No module named 'cloudpickle"

2023-11-12 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33529:
--
Attachment: batch_wc.py

> PyFlink fails with "No module named 'cloudpickle"
> -
>
> Key: FLINK-33529
> URL: https://issues.apache.org/jira/browse/FLINK-33529
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.18.0
> Environment: Python 3.7.16 or Python 3.9
> YARN
>Reporter: Prabhu Joseph
>Priority: Major
> Attachments: batch_wc.py, flink1.17-get_site_packages.py, 
> flink1.18-get_site_packages.py
>
>
> PyFlink fails with "No module named 'cloudpickle" on Flink 1.18. The same 
> program works fine on Flink 1.17. This is after the change 
> (https://issues.apache.org/jira/browse/FLINK-32034).
> *Repro:*
> {code}
> [hadoop@ip-1-2-3-4 ~]$ python --version
> Python 3.7.16
> [hadoop@ip-1-2-3-4 ~]$ rpm -qa | grep flink
> flink-1.18.0-1.amzn2.x86_64
> [hadoop@ip-1-2-3-4 ~]$ flink-yarn-session -d
> [hadoop@ip-1-2-3-4 ~]$ flink run -py /tmp/batch_wc.py --output 
> s3://prabhuflinks3/OUT2/
> {code}
> *Error*
> {code}
> ModuleNotFoundError: No module named 'cloudpickle'
>   at 
> org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:656)
>   at 
> org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:281)
>   at 
> org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator.open(AbstractExternalPythonFunctionOperator.java:57)
>   at 
> org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator.open(AbstractStatelessFunctionOperator.java:92)
>   at 
> org.apache.flink.table.runtime.operators.python.table.PythonTableFunctionOperator.open(PythonTableFunctionOperator.java:114)
>   at 
> org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:753)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:728)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:693)
>   at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
>   at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:922)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
> {code}
> *Analysis*
> 1. On Flink 1.17 and Python 3.7.16, 
> PythonEnvironmentManagerUtils#getSitePackagesPath used to return the 
> following two paths:
> {code}
> [root@ip-172-31-45-97 tmp]# python flink1.17-get_site_packages.py /tmp
> /tmp/lib/python3.7/site-packages
> /tmp/lib64/python3.7/site-packages
> {code}
> whereas Flink 1.18 (FLINK-32034) changed 
> PythonEnvironmentManagerUtils#getSitePackagesPath so that only one path is 
> returned:
> {code}
> [root@ip-172-31-45-97 tmp]# python flink1.18-get_site_packages.py /tmp
> /tmp/lib64/python3.7/site-packages
> [root@ip-172-31-45-97 tmp]#
> {code}
> The PyFlink dependencies are installed in "/tmp/lib/python3.7/site-packages", 
> which is not returned by getSitePackagesPath in Flink 1.18, causing the 
> PyFlink job failure.
> *Attached batch_wc.py, flink1.17-get_site_packages.py and 
> flink1.18-get_site_packages.py.*
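
To make the lib/lib64 split concrete, here is a small illustration of my own (not Flink code): on distributions that split pure-Python and platform packages, probing only the lib64 candidate misses anything pip installed under lib.

{code:java}
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative only: both site-packages candidates under an install prefix,
// mirroring the two paths Flink 1.17's getSitePackagesPath reported above.
// Returning only the lib64 entry misses packages installed under lib.
final class SitePackagesPaths {

    static Set<Path> candidates(Path prefix, String pythonVersion) {
        Set<Path> paths = new LinkedHashSet<>();
        paths.add(prefix.resolve("lib").resolve("python" + pythonVersion).resolve("site-packages"));
        paths.add(prefix.resolve("lib64").resolve("python" + pythonVersion).resolve("site-packages"));
        return paths;
    }

    public static void main(String[] args) {
        // Prints /tmp/lib/python3.7/site-packages and /tmp/lib64/python3.7/site-packages,
        // matching the Flink 1.17 output shown in the analysis.
        candidates(Paths.get("/tmp"), "3.7").forEach(System.out::println);
    }
}
{code}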



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33529) PyFlink fails with "No module named 'cloudpickle"

2023-11-12 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33529:
--
Attachment: flink1.17-get_site_packages.py
flink1.18-get_site_packages.py

> PyFlink fails with "No module named 'cloudpickle"
> -
>
> Key: FLINK-33529
> URL: https://issues.apache.org/jira/browse/FLINK-33529
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.18.0
> Environment: Python 3.7.16 or Python 3.9
> YARN
>Reporter: Prabhu Joseph
>Priority: Major
> Attachments: batch_wc.py, flink1.17-get_site_packages.py, 
> flink1.18-get_site_packages.py
>
>
> PyFlink fails with "No module named 'cloudpickle" on Flink 1.18. The same 
> program works fine on Flink 1.17. This is after the change 
> (https://issues.apache.org/jira/browse/FLINK-32034).
> *Repro:*
> {code}
> [hadoop@ip-1-2-3-4 ~]$ python --version
> Python 3.7.16
> [hadoop@ip-1-2-3-4 ~]$ rpm -qa | grep flink
> flink-1.18.0-1.amzn2.x86_64
> [hadoop@ip-1-2-3-4 ~]$ flink-yarn-session -d
> [hadoop@ip-1-2-3-4 ~]$ flink run -py /tmp/batch_wc.py --output 
> s3://prabhuflinks3/OUT2/
> {code}
> *Error*
> {code}
> ModuleNotFoundError: No module named 'cloudpickle'
>   at 
> org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:656)
>   at 
> org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:281)
>   at 
> org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator.open(AbstractExternalPythonFunctionOperator.java:57)
>   at 
> org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator.open(AbstractStatelessFunctionOperator.java:92)
>   at 
> org.apache.flink.table.runtime.operators.python.table.PythonTableFunctionOperator.open(PythonTableFunctionOperator.java:114)
>   at 
> org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:753)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:728)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:693)
>   at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
>   at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:922)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
> {code}
> *Analysis*
> 1. On Flink 1.17 and Python 3.7.16, 
> PythonEnvironmentManagerUtils#getSitePackagesPath used to return the 
> following two paths:
> {code}
> [root@ip-172-31-45-97 tmp]# python flink1.17-get_site_packages.py /tmp
> /tmp/lib/python3.7/site-packages
> /tmp/lib64/python3.7/site-packages
> {code}
> whereas Flink 1.18 (FLINK-32034) changed 
> PythonEnvironmentManagerUtils#getSitePackagesPath so that only one path is 
> returned:
> {code}
> [root@ip-172-31-45-97 tmp]# python flink1.18-get_site_packages.py /tmp
> /tmp/lib64/python3.7/site-packages
> [root@ip-172-31-45-97 tmp]#
> {code}
> The PyFlink dependencies are installed in "/tmp/lib/python3.7/site-packages", 
> which is not returned by getSitePackagesPath in Flink 1.18, causing the 
> PyFlink job failure.
> *Attached batch_wc.py, flink1.17-get_site_packages.py and 
> flink1.18-get_site_packages.py.*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33529) PyFlink fails with "No module named 'cloudpickle"

2023-11-12 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33529:
--
Description: 
PyFlink fails with "No module named 'cloudpickle" on Flink 1.18. The same 
program works fine on Flink 1.17. This is after the change 
(https://issues.apache.org/jira/browse/FLINK-32034).

*Repro:*

{code}
[hadoop@ip-1-2-3-4 ~]$ python --version
Python 3.7.16

[hadoop@ip-1-2-3-4 ~]$ rpm -qa | grep flink
flink-1.18.0-1.amzn2.x86_64

[hadoop@ip-1-2-3-4 ~]$ flink-yarn-session -d

[hadoop@ip-1-2-3-4 ~]$ flink run -py /tmp/batch_wc.py --output 
s3://prabhuflinks3/OUT2/
{code}


*Error*

{code}
ModuleNotFoundError: No module named 'cloudpickle'

at 
org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:656)
at 
org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:281)
at 
org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator.open(AbstractExternalPythonFunctionOperator.java:57)
at 
org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator.open(AbstractStatelessFunctionOperator.java:92)
at 
org.apache.flink.table.runtime.operators.python.table.PythonTableFunctionOperator.open(PythonTableFunctionOperator.java:114)
at 
org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:753)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:728)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:693)
at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
at 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:922)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
{code}

*Analysis*

1. On Flink 1.17 and Python 3.7.16, 
PythonEnvironmentManagerUtils#getSitePackagesPath used to return the following 
two paths:

{code}
[root@ip-172-31-45-97 tmp]# python flink1.17-get_site_packages.py /tmp
/tmp/lib/python3.7/site-packages
/tmp/lib64/python3.7/site-packages
{code}

whereas Flink 1.18 (FLINK-32034) changed 
PythonEnvironmentManagerUtils#getSitePackagesPath so that only one path is 
returned:

{code}
[root@ip-172-31-45-97 tmp]# python flink1.18-get_site_packages.py /tmp
/tmp/lib64/python3.7/site-packages
[root@ip-172-31-45-97 tmp]#
{code}

The PyFlink dependencies are installed in "/tmp/lib/python3.7/site-packages", 
which is not returned by getSitePackagesPath in Flink 1.18, causing the 
PyFlink job failure.

*Attached batch_wc.py, flink1.17-get_site_packages.py and 
flink1.18-get_site_packages.py.*


  was:
PyFlink fails with "No module named 'cloudpickle" on Flink 1.18. The same 
program works fine on Flink 1.17. This is after the change 
(https://issues.apache.org/jira/browse/FLINK-32034).

*Repro:*

{code}
[hadoop@ip-1-2-3-4 ~]$ python --version
Python 3.7.16

[hadoop@ip-1-2-3-4 ~]$ rpm -qa | grep flink
flink-1.18.0-1.amzn2.x86_64

[hadoop@ip-1-2-3-4 ~]$ flink-yarn-session -d

[hadoop@ip-1-2-3-4 ~]$ flink run -py /tmp/batch_wc.py --output 
s3://prabhuflinks3/OUT2/
{code}


*Error*

{code}
ModuleNotFoundError: No module named 'cloudpickle'

at 
org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:656)
at 
org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:281)
at 
org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator.open(AbstractExternalPythonFunctionOperator.java:57)
at 
org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator.open(AbstractStatelessFunctionOperator.java:92)
at 
org.apache.flink.table.runtime.operators.python.table.PythonTableFunctionOperator.open(PythonTableFunctionOperator.java:114)
at 
org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:753)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:728)
 

[jira] [Created] (FLINK-33529) PyFlink fails with "No module named 'cloudpickle"

2023-11-12 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-33529:
-

 Summary: PyFlink fails with "No module named 'cloudpickle"
 Key: FLINK-33529
 URL: https://issues.apache.org/jira/browse/FLINK-33529
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.18.0
 Environment: Python 3.7.16 or Python 3.9
YARN
Reporter: Prabhu Joseph


PyFlink fails with "No module named 'cloudpickle" on Flink 1.18. The same 
program works fine on Flink 1.17. This is after the change 
(https://issues.apache.org/jira/browse/FLINK-32034).

*Repro:*

{code}
[hadoop@ip-1-2-3-4 ~]$ python --version
Python 3.7.16

[hadoop@ip-1-2-3-4 ~]$ rpm -qa | grep flink
flink-1.18.0-1.amzn2.x86_64

[hadoop@ip-1-2-3-4 ~]$ flink-yarn-session -d

[hadoop@ip-1-2-3-4 ~]$ flink run -py /tmp/batch_wc.py --output 
s3://prabhuflinks3/OUT2/
{code}


*Error*

{code}
ModuleNotFoundError: No module named 'cloudpickle'

at 
org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:656)
at 
org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:281)
at 
org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator.open(AbstractExternalPythonFunctionOperator.java:57)
at 
org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator.open(AbstractStatelessFunctionOperator.java:92)
at 
org.apache.flink.table.runtime.operators.python.table.PythonTableFunctionOperator.open(PythonTableFunctionOperator.java:114)
at 
org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:753)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.call(StreamTaskActionExecutor.java:100)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:728)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:693)
at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
at 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:922)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
{code}

*Analysis*

1. On Flink 1.17 and Python 3.7.16, 
PythonEnvironmentManagerUtils#getSitePackagesPath used to return the following 
two paths:

{code}
[root@ip-172-31-45-97 tmp]# python flink1.17-get_site_packages.py /tmp
/tmp/lib/python3.7/site-packages
/tmp/lib64/python3.7/site-packages
{code}

whereas Flink 1.18 (FLINK-32034) changed 
PythonEnvironmentManagerUtils#getSitePackagesPath so that only one path is 
returned:

{code}
[root@ip-172-31-45-97 tmp]# python flink1.18-get_site_packages.py /tmp
/tmp/lib64/python3.7/site-packages
[root@ip-172-31-45-97 tmp]#
{code}

The PyFlink dependencies are installed in "/tmp/lib/python3.7/site-packages", 
which is not returned by getSitePackagesPath in Flink 1.18, causing the 
PyFlink job failure.

Attached batch_wc.py, flink1.17-get_site_packages.py and 
flink1.18-get_site_packages.py.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33523) DataType ARRAY<INT NOT NULL> fails to cast into Object[]

2023-11-11 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33523:
--
Component/s: Table SQL / API

> DataType ARRAY<INT NOT NULL> fails to cast into Object[]
> 
>
> Key: FLINK-33523
> URL: https://issues.apache.org/jira/browse/FLINK-33523
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.18.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> When upgrading Iceberg's Flink version to 1.18, we found a Flink-related 
> unit test broken by this issue. The below code used to work fine in 
> Flink 1.17 but fails after upgrading to 1.18: DataType ARRAY<INT NOT NULL> 
> fails to cast into Object[].
> *Error:*
> {code}
> Exception in thread "main" java.lang.ClassCastException: [I cannot be cast to 
> [Ljava.lang.Object;
> at FlinkArrayIntNotNullTest.main(FlinkArrayIntNotNullTest.java:18)
> {code}
> *Repro:*
> {code}
>   import org.apache.flink.table.data.ArrayData;
>   import org.apache.flink.table.data.GenericArrayData;
>   import org.apache.flink.table.api.EnvironmentSettings;
>   import org.apache.flink.table.api.TableEnvironment;
>   import org.apache.flink.table.api.TableResult;
>   public class FlinkArrayIntNotNullTest {
> public static void main(String[] args) throws Exception {
>   EnvironmentSettings settings = 
> EnvironmentSettings.newInstance().inBatchMode().build();
>   TableEnvironment env = TableEnvironment.create(settings);
>   env.executeSql("CREATE TABLE filesystemtable2 (id INT, data ARRAY NOT NULL>) WITH ('connector' = 'filesystem', 'path' = 
> '/tmp/FLINK/filesystemtable2', 'format'='json')");
>   env.executeSql("INSERT INTO filesystemtable2 VALUES (4,ARRAY [1,2,3])");
>   TableResult tableResult = env.executeSql("SELECT * from 
> filesystemtable2");
>   ArrayData actualArrayData = new GenericArrayData((Object[]) 
> tableResult.collect().next().getField(1));
> }
>   }
> {code}
> *Analysis:*
> 1. The code works fine with the ARRAY<INT> datatype. The issue happens only 
> when using ARRAY<INT NOT NULL>.
> 2. The code works fine when casting into int[] instead of Object[].
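
A workaround sketch based on the two observations above (my own illustration, not a committed fix): with an ARRAY<INT NOT NULL> column, the field may come back as a primitive int[] (the "[I" in the ClassCastException), which can never be cast to Object[], so the primitive case has to be handled explicitly.

{code:java}
import org.apache.flink.table.data.ArrayData;
import org.apache.flink.table.data.GenericArrayData;

// Workaround sketch: accept both primitive and boxed array results.
final class ArrayFieldWorkaround {

    static ArrayData toArrayData(Object field) {
        if (field instanceof int[]) {
            // NOT NULL element type: Flink 1.18 returns a primitive int[],
            // so wrap it via the int[] path (observation 2 above).
            return new GenericArrayData((int[]) field);
        }
        // Nullable element type: the field is an Object[] as before.
        return new GenericArrayData((Object[]) field);
    }
}
{code}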



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33523) DataType ARRAY<INT NOT NULL> fails to cast into Object[]

2023-11-10 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33523:
--
Description: 
When upgrading Iceberg's Flink version to 1.18, we found a Flink-related unit 
test broken by this issue. The below code used to work fine in Flink 
1.17 but fails after upgrading to 1.18: DataType ARRAY<INT NOT NULL> fails to 
cast into Object[].

*Error:*

{code}
Exception in thread "main" java.lang.ClassCastException: [I cannot be cast to 
[Ljava.lang.Object;
at FlinkArrayIntNotNullTest.main(FlinkArrayIntNotNullTest.java:18)
{code}

*Repro:*

{code}

  import org.apache.flink.table.data.ArrayData;
  import org.apache.flink.table.data.GenericArrayData;
  import org.apache.flink.table.api.EnvironmentSettings;
  import org.apache.flink.table.api.TableEnvironment;
  import org.apache.flink.table.api.TableResult;

  public class FlinkArrayIntNotNullTest {

public static void main(String[] args) throws Exception {

  EnvironmentSettings settings = 
EnvironmentSettings.newInstance().inBatchMode().build();
  TableEnvironment env = TableEnvironment.create(settings);

  env.executeSql("CREATE TABLE filesystemtable2 (id INT, data ARRAY) WITH ('connector' = 'filesystem', 'path' = 
'/tmp/FLINK/filesystemtable2', 'format'='json')");
  env.executeSql("INSERT INTO filesystemtable2 VALUES (4,ARRAY [1,2,3])");
  TableResult tableResult = env.executeSql("SELECT * from 
filesystemtable2");

  ArrayData actualArrayData = new GenericArrayData((Object[]) 
tableResult.collect().next().getField(1));
}
  }

{code}

*Analysis:*

1. The code works fine with the ARRAY<INT> datatype. The issue happens only 
when using ARRAY<INT NOT NULL>.
2. The code works fine when casting into int[] instead of Object[].



  was:
When upgrading Iceberg's Flink version to 1.18, we found the Flink-related unit 
test case broken due to this issue. The below code used to work fine in Flink 
1.17 but failed after upgrading to 1.18. DataType ARRAY<INT NOT NULL> fails to 
cast into Object[].

**Error:**

{code}
Exception in thread "main" java.lang.ClassCastException: [I cannot be cast to 
[Ljava.lang.Object;
at FlinkArrayIntNotNullTest.main(FlinkArrayIntNotNullTest.java:18)
{code}

**Repro:**

{code}

  import org.apache.flink.table.data.ArrayData;
  import org.apache.flink.table.data.GenericArrayData;
  import org.apache.flink.table.api.EnvironmentSettings;
  import org.apache.flink.table.api.TableEnvironment;
  import org.apache.flink.table.api.TableResult;

  public class FlinkArrayIntNotNullTest {

public static void main(String[] args) throws Exception {

  EnvironmentSettings settings = 
EnvironmentSettings.newInstance().inBatchMode().build();
  TableEnvironment env = TableEnvironment.create(settings);

  env.executeSql("CREATE TABLE filesystemtable2 (id INT, data ARRAY) WITH ('connector' = 'filesystem', 'path' = 
'/tmp/FLINK/filesystemtable2', 'format'='json')");
  env.executeSql("INSERT INTO filesystemtable2 VALUES (4,ARRAY [1,2,3])");
  TableResult tableResult = env.executeSql("SELECT * from 
filesystemtable2");

  ArrayData actualArrayData = new GenericArrayData((Object[]) 
tableResult.collect().next().getField(1));
}
  }

{code}





> DataType ARRAY<INT NOT NULL> fails to cast into Object[]
> 
>
> Key: FLINK-33523
> URL: https://issues.apache.org/jira/browse/FLINK-33523
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> When upgrading Iceberg's Flink version to 1.18, we found a Flink-related 
> unit test broken by this issue. The below code used to work fine in 
> Flink 1.17 but fails after upgrading to 1.18: DataType ARRAY<INT NOT NULL> 
> fails to cast into Object[].
> *Error:*
> {code}
> Exception in thread "main" java.lang.ClassCastException: [I cannot be cast to 
> [Ljava.lang.Object;
> at FlinkArrayIntNotNullTest.main(FlinkArrayIntNotNullTest.java:18)
> {code}
> *Repro:*
> {code}
>   import org.apache.flink.table.data.ArrayData;
>   import org.apache.flink.table.data.GenericArrayData;
>   import org.apache.flink.table.api.EnvironmentSettings;
>   import org.apache.flink.table.api.TableEnvironment;
>   import org.apache.flink.table.api.TableResult;
>   public class FlinkArrayIntNotNullTest {
> public static void main(String[] args) throws Exception {
>   EnvironmentSettings settings = 
> EnvironmentSettings.newInstance().inBatchMode().build();
>   TableEnvironment env = TableEnvironment.create(settings);
>   env.executeSql("CREATE TABLE filesystemtable2 (id INT, data ARRAY NOT NULL>) WITH ('connector' = 'filesystem', 'path' = 
> '/tmp/FLINK/filesystemtable2', 'format'='json')");
>   env.executeSql("INSERT INTO filesystemtable2 VALUES (4,ARRAY [1,2,3])");
>   TableResult tableResult = env.executeSql("SELECT 

[jira] [Created] (FLINK-33523) DataType ARRAY<INT NOT NULL> fails to cast into Object[]

2023-11-10 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-33523:
-

 Summary: DataType ARRAY<INT NOT NULL> fails to cast into Object[]
 Key: FLINK-33523
 URL: https://issues.apache.org/jira/browse/FLINK-33523
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.18.0
Reporter: Prabhu Joseph


When upgrading Iceberg's Flink version to 1.18, we found a Flink-related unit 
test broken by this issue. The below code used to work fine in Flink 
1.17 but fails after upgrading to 1.18: DataType ARRAY<INT NOT NULL> fails to 
cast into Object[].

**Error:**

{code}
Exception in thread "main" java.lang.ClassCastException: [I cannot be cast to 
[Ljava.lang.Object;
at FlinkArrayIntNotNullTest.main(FlinkArrayIntNotNullTest.java:18)
{code}

**Repro:**

{code}

  import org.apache.flink.table.data.ArrayData;
  import org.apache.flink.table.data.GenericArrayData;
  import org.apache.flink.table.api.EnvironmentSettings;
  import org.apache.flink.table.api.TableEnvironment;
  import org.apache.flink.table.api.TableResult;

  public class FlinkArrayIntNotNullTest {

public static void main(String[] args) throws Exception {

  EnvironmentSettings settings = 
EnvironmentSettings.newInstance().inBatchMode().build();
  TableEnvironment env = TableEnvironment.create(settings);

  env.executeSql("CREATE TABLE filesystemtable2 (id INT, data ARRAY) WITH ('connector' = 'filesystem', 'path' = 
'/tmp/FLINK/filesystemtable2', 'format'='json')");
  env.executeSql("INSERT INTO filesystemtable2 VALUES (4,ARRAY [1,2,3])");
  TableResult tableResult = env.executeSql("SELECT * from 
filesystemtable2");

  ArrayData actualArrayData = new GenericArrayData((Object[]) 
tableResult.collect().next().getField(1));
}
  }

{code}






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33473) Update Flink client to 1.18.0

2023-11-07 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783548#comment-17783548
 ] 

Prabhu Joseph commented on FLINK-33473:
---

Thanks [~fanrui]. Will close this one.

> Update Flink client to 1.18.0
> -
>
> Key: FLINK-33473
> URL: https://issues.apache.org/jira/browse/FLINK-33473
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Prabhu Joseph
>Priority: Major
>
> Update the operator flink dependency to Flink-1.18.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-33473) Update Flink client to 1.18.0

2023-11-07 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph resolved FLINK-33473.
---
Resolution: Duplicate

> Update Flink client to 1.18.0
> -
>
> Key: FLINK-33473
> URL: https://issues.apache.org/jira/browse/FLINK-33473
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Prabhu Joseph
>Priority: Major
>
> Update the operator flink dependency to Flink-1.18.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33473) Update Flink client to 1.18.0

2023-11-06 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-33473:
-

 Summary: Update Flink client to 1.18.0
 Key: FLINK-33473
 URL: https://issues.apache.org/jira/browse/FLINK-33473
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Prabhu Joseph


Update the operator flink dependency to Flink-1.18.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33358) Flink SQL Client fails to start in Flink on YARN

2023-10-24 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-33358:
-

 Summary: Flink SQL Client fails to start in Flink on YARN
 Key: FLINK-33358
 URL: https://issues.apache.org/jira/browse/FLINK-33358
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN, Table SQL / Client
Affects Versions: 1.18.0
Reporter: Prabhu Joseph


Flink SQL Client fails to start in Flink on YARN with the below error:
{code:java}
flink-yarn-session -tm 2048 -s 2 -d

/usr/lib/flink/bin/sql-client.sh 

Exception in thread "main" org.apache.flink.table.client.SqlClientException: 
Could not read from command line.
at 
org.apache.flink.table.client.cli.CliClient.getAndExecuteStatements(CliClient.java:221)
at 
org.apache.flink.table.client.cli.CliClient.executeInteractive(CliClient.java:179)
at 
org.apache.flink.table.client.cli.CliClient.executeInInteractiveMode(CliClient.java:121)
at 
org.apache.flink.table.client.cli.CliClient.executeInInteractiveMode(CliClient.java:114)
at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:169)
at org.apache.flink.table.client.SqlClient.start(SqlClient.java:118)
at 
org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:228)
at org.apache.flink.table.client.SqlClient.main(SqlClient.java:179)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.flink.table.client.config.SqlClientOptions
at 
org.apache.flink.table.client.cli.parser.SqlClientSyntaxHighlighter.highlight(SqlClientSyntaxHighlighter.java:59)
at 
org.jline.reader.impl.LineReaderImpl.getHighlightedBuffer(LineReaderImpl.java:3633)
at 
org.jline.reader.impl.LineReaderImpl.getDisplayedBufferWithPrompts(LineReaderImpl.java:3615)
at 
org.jline.reader.impl.LineReaderImpl.redisplay(LineReaderImpl.java:3554)
at 
org.jline.reader.impl.LineReaderImpl.doCleanup(LineReaderImpl.java:2340)
at 
org.jline.reader.impl.LineReaderImpl.cleanup(LineReaderImpl.java:2332)
at 
org.jline.reader.impl.LineReaderImpl.readLine(LineReaderImpl.java:626)
at 
org.apache.flink.table.client.cli.CliClient.getAndExecuteStatements(CliClient.java:194)
... 7 more
{code}
The issue is caused by the old jline jar from the Hadoop classpath 
(/usr/lib/hadoop-yarn/lib/jline-3.9.0.jar) taking precedence. Flink 1.18 
requires jline-3.21.0.jar.

Placing flink-sql-client.jar (bundled with jline-3.21) before the Hadoop 
classpath fixes the issue.
{code:java}
diff --git a/flink-table/flink-sql-client/bin/sql-client.sh 
b/flink-table/flink-sql-client/bin/sql-client.sh
index 24746c5dc8..4ab8635de2 100755
--- a/flink-table/flink-sql-client/bin/sql-client.sh
+++ b/flink-table/flink-sql-client/bin/sql-client.sh
@@ -89,7 +89,7 @@ if [[ "$CC_CLASSPATH" =~ .*flink-sql-client.*.jar ]]; then
 elif [ -n "$FLINK_SQL_CLIENT_JAR" ]; then
 
 # start client with jar
-exec "$JAVA_RUN" $FLINK_ENV_JAVA_OPTS $JVM_ARGS "${log_setting[@]}" 
-classpath "`manglePathList 
"$CC_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS:$FLINK_SQL_CLIENT_JAR"`" 
org.apache.flink.table.client.SqlClient "$@" --jar "`manglePath 
$FLINK_SQL_CLIENT_JAR`"
+exec "$JAVA_RUN" $FLINK_ENV_JAVA_OPTS $JVM_ARGS "${log_setting[@]}" 
-classpath "`manglePathList 
"$CC_CLASSPATH:$FLINK_SQL_CLIENT_JAR:$INTERNAL_HADOOP_CLASSPATHS`" 
org.apache.flink.table.client.SqlClient "$@" --jar "`manglePath 
$FLINK_SQL_CLIENT_JAR`"
 
 # write error message to stderr
 else
{code}
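
A quick way to confirm the shadowing (a diagnostic sketch of my own, not part of Flink): print which jar the jline LineReader class is actually loaded from when run with the same classpath that sql-client.sh assembles.

{code:java}
// Diagnostic sketch: run with the sql-client classpath; if the printed
// location is the Hadoop-provided jline-3.9.0.jar rather than the jline
// bundled with flink-sql-client, the classpath-ordering fix above is needed.
public final class WhichJar {
    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> clazz = Class.forName("org.jline.reader.LineReader");
        System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());
    }
}
{code}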



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33358) Flink SQL Client fails to start in Flink on YARN

2023-10-24 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-33358:
--
Description: 
Flink SQL Client fails to start in Flink on YARN with the below error:
{code:java}
flink-yarn-session -tm 2048 -s 2 -d

/usr/lib/flink/bin/sql-client.sh 

Exception in thread "main" org.apache.flink.table.client.SqlClientException: 
Could not read from command line.
at 
org.apache.flink.table.client.cli.CliClient.getAndExecuteStatements(CliClient.java:221)
at 
org.apache.flink.table.client.cli.CliClient.executeInteractive(CliClient.java:179)
at 
org.apache.flink.table.client.cli.CliClient.executeInInteractiveMode(CliClient.java:121)
at 
org.apache.flink.table.client.cli.CliClient.executeInInteractiveMode(CliClient.java:114)
at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:169)
at org.apache.flink.table.client.SqlClient.start(SqlClient.java:118)
at 
org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:228)
at org.apache.flink.table.client.SqlClient.main(SqlClient.java:179)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.flink.table.client.config.SqlClientOptions
at 
org.apache.flink.table.client.cli.parser.SqlClientSyntaxHighlighter.highlight(SqlClientSyntaxHighlighter.java:59)
at 
org.jline.reader.impl.LineReaderImpl.getHighlightedBuffer(LineReaderImpl.java:3633)
at 
org.jline.reader.impl.LineReaderImpl.getDisplayedBufferWithPrompts(LineReaderImpl.java:3615)
at 
org.jline.reader.impl.LineReaderImpl.redisplay(LineReaderImpl.java:3554)
at 
org.jline.reader.impl.LineReaderImpl.doCleanup(LineReaderImpl.java:2340)
at 
org.jline.reader.impl.LineReaderImpl.cleanup(LineReaderImpl.java:2332)
at 
org.jline.reader.impl.LineReaderImpl.readLine(LineReaderImpl.java:626)
at 
org.apache.flink.table.client.cli.CliClient.getAndExecuteStatements(CliClient.java:194)
... 7 more
{code}
The issue is caused by the old jline jar from the Hadoop (3.3.3) classpath 
(/usr/lib/hadoop-yarn/lib/jline-3.9.0.jar) taking precedence. Flink 1.18 
requires jline-3.21.0.jar.

Placing flink-sql-client.jar (bundled with jline-3.21) before the Hadoop 
classpath fixes the issue.
{code:java}
diff --git a/flink-table/flink-sql-client/bin/sql-client.sh 
b/flink-table/flink-sql-client/bin/sql-client.sh
index 24746c5dc8..4ab8635de2 100755
--- a/flink-table/flink-sql-client/bin/sql-client.sh
+++ b/flink-table/flink-sql-client/bin/sql-client.sh
@@ -89,7 +89,7 @@ if [[ "$CC_CLASSPATH" =~ .*flink-sql-client.*.jar ]]; then
 elif [ -n "$FLINK_SQL_CLIENT_JAR" ]; then
 
 # start client with jar
-exec "$JAVA_RUN" $FLINK_ENV_JAVA_OPTS $JVM_ARGS "${log_setting[@]}" 
-classpath "`manglePathList 
"$CC_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS:$FLINK_SQL_CLIENT_JAR"`" 
org.apache.flink.table.client.SqlClient "$@" --jar "`manglePath 
$FLINK_SQL_CLIENT_JAR`"
+exec "$JAVA_RUN" $FLINK_ENV_JAVA_OPTS $JVM_ARGS "${log_setting[@]}" 
-classpath "`manglePathList 
"$CC_CLASSPATH:$FLINK_SQL_CLIENT_JAR:$INTERNAL_HADOOP_CLASSPATHS`" 
org.apache.flink.table.client.SqlClient "$@" --jar "`manglePath 
$FLINK_SQL_CLIENT_JAR`"
 
 # write error message to stderr
 else
{code}

  was:
Flink SQL Client fails to start in Flink on YARN with the below error:
{code:java}
flink-yarn-session -tm 2048 -s 2 -d

/usr/lib/flink/bin/sql-client.sh 

Exception in thread "main" org.apache.flink.table.client.SqlClientException: 
Could not read from command line.
at 
org.apache.flink.table.client.cli.CliClient.getAndExecuteStatements(CliClient.java:221)
at 
org.apache.flink.table.client.cli.CliClient.executeInteractive(CliClient.java:179)
at 
org.apache.flink.table.client.cli.CliClient.executeInInteractiveMode(CliClient.java:121)
at 
org.apache.flink.table.client.cli.CliClient.executeInInteractiveMode(CliClient.java:114)
at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:169)
at org.apache.flink.table.client.SqlClient.start(SqlClient.java:118)
at 
org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:228)
at org.apache.flink.table.client.SqlClient.main(SqlClient.java:179)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.flink.table.client.config.SqlClientOptions
at 
org.apache.flink.table.client.cli.parser.SqlClientSyntaxHighlighter.highlight(SqlClientSyntaxHighlighter.java:59)
at 
org.jline.reader.impl.LineReaderImpl.getHighlightedBuffer(LineReaderImpl.java:3633)
at 
org.jline.reader.impl.LineReaderImpl.getDisplayedBufferWithPrompts(LineReaderImpl.java:3615)
at 
org.jline.reader.impl.LineReaderImpl.redisplay(LineReaderImpl.java:3554)
at 
org.jline.reader.impl.LineReaderImpl.doCleanup(LineReaderImpl.java:2340)

[jira] [Updated] (FLINK-32253) Blocklist unblockResources does not update the pending resource request

2023-07-06 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-32253:
--
Component/s: Deployment / Kubernetes

> Blocklist unblockResources does not update the pending resource request
> ---
>
> Key: FLINK-32253
> URL: https://issues.apache.org/jira/browse/FLINK-32253
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Kubernetes, Deployment / YARN
>Affects Versions: 1.17.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>  Labels: pull-request-available
>
> Blocklist unblockResources does not update the existing pending resource 
> requests from YARN/K8s; it takes effect only for new resource requests. As a 
> result, existing pending resource requests are not scheduled on nodes that 
> have been unblocked.
> ResourceManager#unblockResources has to notify 
> YarnResourceManagerDriver/KubernetesResourceManagerDriver so that the driver 
> updates the pending resource requests; a sketch of this wiring follows below. 
> YarnResourceManagerDriver#tryUpdateApplicationBlockList could be called 
> during unblockResources.
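
A minimal sketch of that wiring (simplified types of my own; only the tryUpdateApplicationBlockList hook is named in this issue, the rest is illustrative):

{code:java}
import java.util.Collection;
import java.util.Set;

// Illustrative driver hook: lets the ResourceManager tell the active driver
// (YARN or Kubernetes) to refresh its view of the blocklist.
interface ResourceManagerDriverHook {
    void tryUpdateApplicationBlockList();
}

final class BlocklistCoordinator {

    private final ResourceManagerDriverHook driver;
    private final Set<String> blockedNodes;

    BlocklistCoordinator(ResourceManagerDriverHook driver, Set<String> blockedNodes) {
        this.driver = driver;
        this.blockedNodes = blockedNodes;
    }

    void unblockResources(Collection<String> nodesToUnblock) {
        blockedNodes.removeAll(nodesToUnblock);
        // Without this call, only future requests see the updated blocklist;
        // pending requests would keep avoiding the now-unblocked nodes.
        driver.tryUpdateApplicationBlockList();
    }
}
{code}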



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32484) AdaptiveScheduler combined restart during scaling out

2023-06-29 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738530#comment-17738530
 ] 

Prabhu Joseph commented on FLINK-32484:
---

[~gyfora] If you are fine with this idea, could you assign this ticket to me? I 
can work on this and come up with a patch.

> AdaptiveScheduler combined restart during scaling out
> -
>
> Key: FLINK-32484
> URL: https://issues.apache.org/jira/browse/FLINK-32484
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core
>Affects Versions: 1.17.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> On a scaling-out operation, when nodes are added at different times, 
> AdaptiveScheduler performs multiple restarts within a short period of time. 
> On one of our Flink jobs, we have seen AdaptiveScheduler restart the 
> ExecutionGraph every time it is notified of new resources: five restarts 
> within 3 minutes.
> AdaptiveScheduler could provide a configurable restart window interval during 
> which it combines the notified resources and restarts once, either when the 
> available resources are sufficient to fit the desired parallelism or when 
> the window times out. The window is opened on the first resource notification 
> received. This applies only when the execution graph is in the executing 
> state, not in the waiting-for-resources state; a sketch of such a window 
> follows the log excerpt below.
>  
> {code:java}
> [root@ip-1-2-3-4 container_1688034805200_0002_01_01]# grep -i scale *
> jobmanager.log:2023-06-29 10:46:58,061 INFO  
> org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
> resources are available. Restarting job to scale up.
> jobmanager.log:2023-06-29 10:47:57,317 INFO  
> org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
> resources are available. Restarting job to scale up.
> jobmanager.log:2023-06-29 10:48:53,314 INFO  
> org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
> resources are available. Restarting job to scale up.
> jobmanager.log:2023-06-29 10:49:27,821 INFO  
> org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
> resources are available. Restarting job to scale up.
> jobmanager.log:2023-06-29 10:50:15,672 INFO  
> org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
> resources are available. Restarting job to scale up.
> [root@ip-1-2-3-4 container_1688034805200_0002_01_01]# {code}
>  
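
A minimal sketch of the proposed debounce window (names and wiring are illustrative, not actual AdaptiveScheduler APIs):

{code:java}
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// The first "new resources" notification opens a window; the job restarts
// immediately if the desired parallelism is now reachable, otherwise once
// when the window times out, instead of once per notification.
final class RestartWindow {

    private final Duration window;
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> pending;

    RestartWindow(Duration window) {
        this.window = window;
    }

    synchronized void onNewResources(
            BooleanSupplier desiredParallelismReached, Runnable restart) {
        if (desiredParallelismReached.getAsBoolean()) {
            if (pending != null) {
                pending.cancel(false);
                pending = null;
            }
            restart.run(); // enough resources: restart right away
        } else if (pending == null) {
            // First notification opens the window; later ones are absorbed.
            pending = timer.schedule(
                    () -> {
                        synchronized (this) {
                            pending = null;
                        }
                        restart.run(); // window timed out: restart once
                    },
                    window.toMillis(),
                    TimeUnit.MILLISECONDS);
        }
    }
}
{code}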



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32484) AdaptiveScheduler combined restart during scaling out

2023-06-29 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-32484:
--
Description: 
On a scaling-out operation, when nodes are added at different times, 
AdaptiveScheduler performs multiple restarts within a short period of time. On 
one of our Flink jobs, we have seen AdaptiveScheduler restart the 
ExecutionGraph every time it is notified of new resources: five restarts 
within 3 minutes.

AdaptiveScheduler could provide a configurable restart window interval during 
which it combines the notified resources and restarts once, either when the 
available resources are sufficient to fit the desired parallelism or when the 
window times out. The window is opened on the first resource notification 
received. This applies only when the execution graph is in the executing state, 
not in the waiting-for-resources state.

 
{code:java}
[root@ip-1-2-3-4 container_1688034805200_0002_01_01]# grep -i scale *
jobmanager.log:2023-06-29 10:46:58,061 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:47:57,317 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:48:53,314 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:49:27,821 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:50:15,672 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
[root@ip-1-2-3-4 container_1688034805200_0002_01_01]# {code}
 

  was:
On a scaling-out operation, when nodes are added at different times, 
AdaptiveScheduler performs multiple restarts within a short period of time. On 
one of our Flink jobs, we have seen AdaptiveScheduler restart the 
ExecutionGraph every time it is notified of new resources: five restarts 
within 3 minutes.

AdaptiveScheduler could provide a configurable restart window interval during 
which it combines the notified resources and restarts once, either when the 
available resources are sufficient to fit the desired parallelism or when the 
window times out. The window is opened on the first resource notification 
received. This applies only when the execution graph is in the executing state, 
not in the waiting-for-resources state.

 
{code:java}
[root@ip-172-31-40-185 container_1688034805200_0002_01_01]# grep -i scale *
jobmanager.log:2023-06-29 10:46:58,061 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:47:57,317 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:48:53,314 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:49:27,821 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:50:15,672 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
[root@ip-172-31-40-185 container_1688034805200_0002_01_01]# {code}
 


> AdaptiveScheduler combined restart during scaling out
> -
>
> Key: FLINK-32484
> URL: https://issues.apache.org/jira/browse/FLINK-32484
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core
>Affects Versions: 1.17.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> On a scaling-out operation, when nodes are added at different times, 
> AdaptiveScheduler does multiple restarts within a short period of time. On 
> one of our Flink jobs, we have seen AdaptiveScheduler restart the 
> ExecutionGraph every time there is a notification of new resources to it. 
> There are five restarts within 3 minutes.
> AdaptiveScheduler could provide a configurable restart window interval to the 
> user during which it combines the notified resources and restarts once when 
> the available resources are sufficient to fit the desired parallelism or when 
> the window times out. The window is created during the first notification of 
> resources received. This is applicable only when the execution graph is in 
> the executing state and not in the waiting for resources state.

[jira] [Updated] (FLINK-32484) AdaptiveScheduler combined restart during scaling out

2023-06-29 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-32484:
--
Description: 
On a scaling-out operation, when nodes are added at different times, 
AdaptiveScheduler does multiple restarts within a short period of time. On one 
of our Flink jobs, we have seen AdaptiveScheduler restart the ExecutionGraph 
every time there is a notification of new resources to it. There are five 
restarts within 3 minutes.

AdaptiveScheduler could provide a configurable restart window interval to the 
user during which it combines the notified resources and restarts once when the 
available resources are sufficient to fit the desired parallelism or when the 
window times out. The window is created during the first notification of 
resources received. This is applicable only when the execution graph is in the 
executing state and not in the waiting for resources state.

 
{code:java}
[root@ip-172-31-40-185 container_1688034805200_0002_01_01]# grep -i scale *
jobmanager.log:2023-06-29 10:46:58,061 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:47:57,317 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:48:53,314 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:49:27,821 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:50:15,672 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
[root@ip-172-31-40-185 container_1688034805200_0002_01_01]# {code}
 

  was:
On a scaling-out operation, when nodes are added at different times, 
AdaptiveScheduler does multiple restarts within a short period of time. On one 
of our Flink jobs, we have seen AdaptiveScheduler restart the ExecutionGraph 
every time there is a notification of new resources to it. There are five 
restarts within 3 minutes.

AdaptiveScheduler could provide a configurable restart window interval to the 
user during which it combines the notified resources and restarts once when the 
available resources are sufficient to fit the desired parallelism or when the 
window times out. This is applicable only when the execution graph is in the 
executing state and not in the waiting for resources state.

 
{code:java}
[root@ip-172-31-40-185 container_1688034805200_0002_01_01]# grep -i scale *
jobmanager.log:2023-06-29 10:46:58,061 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:47:57,317 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:48:53,314 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:49:27,821 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:50:15,672 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
[root@ip-172-31-40-185 container_1688034805200_0002_01_01]# {code}
 


> AdaptiveScheduler combined restart during scaling out
> -
>
> Key: FLINK-32484
> URL: https://issues.apache.org/jira/browse/FLINK-32484
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core
>Affects Versions: 1.17.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> On a scaling-out operation, when nodes are added at different times, 
> AdaptiveScheduler does multiple restarts within a short period of time. On 
> one of our Flink jobs, we have seen AdaptiveScheduler restart the 
> ExecutionGraph every time there is a notification of new resources to it. 
> There are five restarts within 3 minutes.
> AdaptiveScheduler could provide a configurable restart window interval to the 
> user during which it combines the notified resources and restarts once when 
> the available resources are sufficient to fit the desired parallelism or when 
> the window times out. The window is created during the first notification of 
> resources received. This is applicable only when the execution graph is in 
> the executing state and not in the waiting for resources state.
>  
> 

[jira] [Created] (FLINK-32484) AdaptiveScheduler combined restart during scaling out

2023-06-29 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-32484:
-

 Summary: AdaptiveScheduler combined restart during scaling out
 Key: FLINK-32484
 URL: https://issues.apache.org/jira/browse/FLINK-32484
 Project: Flink
  Issue Type: Improvement
  Components: API / Core
Affects Versions: 1.17.0
Reporter: Prabhu Joseph


On a scaling-out operation, when nodes are added at different times, 
AdaptiveScheduler does multiple restarts within a short period of time. On one 
of our Flink jobs, we have seen AdaptiveScheduler restart the ExecutionGraph 
every time there is a notification of new resources to it. There are five 
restarts within 3 minutes.

AdaptiveScheduler could provide a configurable restart window interval to the 
user during which it combines the notified resources and restarts once when the 
available resources are sufficient to fit the desired parallelism or when the 
window times out. This is applicable only when the execution graph is in the 
executing state and not in the waiting for resources state.

 
{code:java}
[root@ip-172-31-40-185 container_1688034805200_0002_01_01]# grep -i scale *
jobmanager.log:2023-06-29 10:46:58,061 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:47:57,317 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:48:53,314 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:49:27,821 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
jobmanager.log:2023-06-29 10:50:15,672 INFO  
org.apache.flink.runtime.scheduler.adaptive.AdaptiveScheduler [] - New 
resources are available. Restarting job to scale up.
[root@ip-172-31-40-185 container_1688034805200_0002_01_01]# {code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32253) Blocklist unblockResources does not update the pending resource request

2023-06-06 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17729802#comment-17729802
 ] 

Prabhu Joseph commented on FLINK-32253:
---

Yes, sure. Could you assign this ticket to me? Thanks.

> Blocklist unblockResources does not update the pending resource request
> ---
>
> Key: FLINK-32253
> URL: https://issues.apache.org/jira/browse/FLINK-32253
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.17.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Blocklist unblockResources does not update the existing pending resource 
> request from YARN/K8S. It updates only for the new resource requests. The 
> existing pending resource requests are not scheduled on the nodes which are 
> unblocked.
> ResourceManager#unblockResources has to notify 
> YarnResourceManagerDriver/KubernetesResourceManagerDriver so that the driver 
> updates the pending resource request. 
> YarnResourceManagerDriver#tryUpdateApplicationBlockList could be called 
> during unblockResources.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32253) Blocklist unblockResources does not update the pending resource request

2023-06-05 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17729590#comment-17729590
 ] 

Prabhu Joseph commented on FLINK-32253:
---

Thanks [~wanglijie] for your comment. For K8s, I think 
{{KubernetesResourceManagerDriver}} could find any pending TaskManager pods 
whose node affinity has a notIn entry for any of the unblocked nodes and update 
their pod specs. 




> Blocklist unblockResources does not update the pending resource request
> ---
>
> Key: FLINK-32253
> URL: https://issues.apache.org/jira/browse/FLINK-32253
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.17.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Blocklist unblockResources does not update the existing pending resource 
> request from YARN/K8S. It updates only for the new resource requests. The 
> existing pending resource requests are not scheduled on the nodes which are 
> unblocked.
> ResourceManager#unblockResources has to notify 
> YarnResourceManagerDriver/KubernetesResourceManagerDriver so that the driver 
> updates the pending resource request. 
> YarnResourceManagerDriver#tryUpdateApplicationBlockList could be called 
> during unblockResources.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32253) Blocklist unblockResources does not update the pending resource request

2023-06-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-32253:
--
Issue Type: Improvement  (was: Bug)

> Blocklist unblockResources does not update the pending resource request
> ---
>
> Key: FLINK-32253
> URL: https://issues.apache.org/jira/browse/FLINK-32253
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.17.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Blocklist unblockResources does not update the existing pending resource 
> request from YARN/K8S. It updates only for the new resource requests. The 
> existing pending resource requests are not scheduled on the nodes which are 
> unblocked.
> ResourceManager#unblockResources has to notify 
> YarnResourceManagerDriver/KubernetesResourceManagerDriver so that the driver 
> updates the pending resource request. 
> YarnResourceManagerDriver#tryUpdateApplicationBlockList could be called 
> during unblockResources.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32253) Blocklist unblockResources does not update the pending resource request

2023-06-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-32253:
--
Description: 
Blocklist unblockResources does not update the existing pending resource 
requests from YARN/K8s; it applies only to new resource requests. The existing 
pending resource requests are not scheduled on the nodes that were unblocked.

ResourceManager#unblockResources has to notify 
YarnResourceManagerDriver/KubernetesResourceManagerDriver so that the driver 
updates the pending resource requests. 
YarnResourceManagerDriver#tryUpdateApplicationBlockList could be called during 
unblockResources, as sketched below.
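
A rough sketch of the proposed wiring; only 
{{YarnResourceManagerDriver#tryUpdateApplicationBlockList}} is named by this 
issue, every other name below is illustrative:

{code:java}
// Hypothetical sketch, not Flink code: shows where the driver notification
// would hook into unblockResources.
import java.util.Collection;

class BlocklistUpdateSketch {

    interface ResourceManagerDriver {
        // Proposed callback so the driver can refresh already-pending requests.
        void onNodesUnblocked(Collection<String> unblockedNodes);
    }

    static final class YarnDriverSketch implements ResourceManagerDriver {
        @Override
        public void onNodesUnblocked(Collection<String> unblockedNodes) {
            // Re-send the current blocklist so YARN can also place *pending*
            // container requests on the newly unblocked nodes.
            tryUpdateApplicationBlockList();
        }

        private void tryUpdateApplicationBlockList() {
            // In the real driver this pushes the updated blocklist to YARN
            // (the Kubernetes driver would update pending pod specs instead).
        }
    }

    private final ResourceManagerDriver driver;

    BlocklistUpdateSketch(ResourceManagerDriver driver) {
        this.driver = driver;
    }

    void unblockResources(Collection<String> unblockedNodes) {
        // Today only new requests see the change; the proposed addition is to
        // notify the driver so pending requests are updated as well.
        driver.onNodesUnblocked(unblockedNodes);
    }
}
{code}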




  was:
Blocklist unblockResources does not update the existing pending resource 
request from YARN/K8S. It updates only for the new resource requests. The 
existing pending resource requests are not scheduled on the nodes which are 
unblocked.






> Blocklist unblockResources does not update the pending resource request
> ---
>
> Key: FLINK-32253
> URL: https://issues.apache.org/jira/browse/FLINK-32253
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Blocklist unblockResources does not update the existing pending resource 
> request from YARN/K8S. It updates only for the new resource requests. The 
> existing pending resource requests are not scheduled on the nodes which are 
> unblocked.
> ResourceManager#unblockResources has to notify 
> YarnResourceManagerDriver/KubernetesResourceManagerDriver so that the driver 
> updates the pending resource request. 
> YarnResourceManagerDriver#tryUpdateApplicationBlockList could be called 
> during unblockResources.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32253) Blocklist unblockResources does not update the pending resource request

2023-06-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-32253:
--
Summary: Blocklist unblockResources does not update the pending resource 
request  (was: Blocklist unblockResources does not update for pending 
containers)

> Blocklist unblockResources does not update the pending resource request
> ---
>
> Key: FLINK-32253
> URL: https://issues.apache.org/jira/browse/FLINK-32253
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Blocklist unblockResources does not update the existing pending resource 
> request from YARN/K8S. It updates only for the new resource requests. The 
> existing pending resource requests are not scheduled on the nodes which are 
> unblocked.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32253) Blocklist unblockResources does not update for pending containers

2023-06-05 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-32253:
-

 Summary: Blocklist unblockResources does not update for pending 
containers
 Key: FLINK-32253
 URL: https://issues.apache.org/jira/browse/FLINK-32253
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Prabhu Joseph


Blocklist unblockResources does not update the existing pending resource 
request from YARN/K8S. It updates only for the new resource requests. The 
existing pending resource requests are not scheduled on the nodes which are 
unblocked.







--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32173) Flink Job Metrics returns stale values in the first request after an update in the values

2023-05-23 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-32173:
-

 Summary: Flink Job Metrics returns stale values in the first 
request after an update in the values
 Key: FLINK-32173
 URL: https://issues.apache.org/jira/browse/FLINK-32173
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Metrics
Affects Versions: 1.17.0
Reporter: Prabhu Joseph


Flink Job Metrics returns stale values in the first request after an update in 
the values.

*Repro:*

1. Run a Flink job with a fixed-delay restart strategy and multiple attempts 
{code}
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 1


flink run -Dexecution.checkpointing.interval="10s" -d -c 
org.apache.flink.streaming.examples.wordcount.WordCount 
/usr/lib/flink/examples/streaming/WordCount.jar
{code}

2. Kill one of the TaskManagers, which will initiate a job restart.

3. After the job has restarted, fetch any job metric. The first time, it 
returns the stale (older) value 48.

{code}
[hadoop@ip-172-31-44-70 ~]$ curl 
http://jobmanager:52000/jobs/d24f7d74d541f1215a65395e0ebd898c/metrics?get=numRestarts
  | jq .
[
  {
"id": "numRestarts",
"value": "48"
  }
]
{code}

4. On subsequent runs, it returns the correct value.
{code}
[hadoop@ip-172-31-44-70 ~]$ curl 
http://jobmanager:52000/jobs/d24f7d74d541f1215a65395e0ebd898c/metrics?get=numRestarts
  | jq .
[
  {
"id": "numRestarts",
"value": "49"
  }
]
{code}

5. Repeat steps 2 to 4, which shows that the first request after an update to 
the metric returns the value from before the update; only the next request 
returns the correct value (see the sketch below).
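
For reference, the symptom matches a cache-then-refresh pattern, where a query 
returns the cached value and only triggers an asynchronous refresh. A minimal 
illustrative sketch (not the actual Flink metric fetcher code):

{code:java}
// Illustrative only: a query returns the cached value and merely *triggers*
// a refresh, so the first query after a change always sees the old value.
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

class StaleMetricCacheSketch {
    private final AtomicLong cached = new AtomicLong();

    long query() {
        long value = cached.get();                 // possibly stale
        CompletableFuture.runAsync(this::refresh); // lands after this returns
        return value;
    }

    private void refresh() {
        cached.set(fetchFromJobManager());
    }

    private long fetchFromJobManager() {
        return System.nanoTime(); // stand-in for the real metric fetch
    }
}
{code}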







--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-26425) Yarn aggregate log files in a rolling fashion for flink

2023-04-27 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17717324#comment-17717324
 ] 

Prabhu Joseph commented on FLINK-26425:
---

This is very useful for Flink on YARN: without it, the rolled log files 
(jobmanager.log.n or taskmanager.log.n) won't be aggregated into the HDFS/S3 
log aggregation path, which can fill up the local disk. 
[SPARK-15990|https://issues.apache.org/jira/browse/SPARK-15990] has added this 
support.

{quote}Could you give some examples of the rolling fashion under the settings 
of spark.yarn.rolledLog.excludePattern and spark.yarn.rolledLog.includePattern?
{quote}
Example include and exclude patterns would look like the following; YARN will 
then aggregate all rotated jobmanager log files and leave the out file alone.

{code}
flink.yarn.rolledLog.includePattern=jobmanager.log.*
flink.yarn.rolledLog.excludePattern=jobmanager.out
{code}

I can come up with a patch for this. Could you assign this to me? Thanks.

> Yarn aggregate log files in a rolling fashion for flink
> ---
>
> Key: FLINK-26425
> URL: https://issues.apache.org/jira/browse/FLINK-26425
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.14.3
>Reporter: wang da wei
>Priority: Major
>
> For Spark on YARN, you can set spark.yarn.rolledLog.includePattern and 
> spark.yarn.rolledLog.excludePattern to filter the log files, and those log 
> files will be aggregated in a rolling fashion. Do you have similar settings 
> in Flink?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31518) HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone mode

2023-03-30 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17706866#comment-17706866
 ] 

Prabhu Joseph commented on FLINK-31518:
---

Could you assign this ticket to my team member [~samrat007] to continue working 
on this patch?

> HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone 
> mode
> 
>
> Key: FLINK-31518
> URL: https://issues.apache.org/jira/browse/FLINK-31518
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration, Runtime / REST
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.17.1
>Reporter: Vineeth Naroju
>Priority: Major
>  Labels: pull-request-available
>
> {{HighAvailabilityServiceUtils#createHighAvailabilityServices()}}  in 
> {{HighAvailabilityMode.NONE}} mode uses 
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} to get web monitor 
> address.
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} reads only from 
> {{rest.port}} in {{flink-conf.yaml}} . If {{rest.port}} is not enabled, then 
> it returns {{0}} port number. It should dynamically fetch port number if 
> {{rest.port}} is disabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-31518) HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone mode

2023-03-29 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17706233#comment-17706233
 ] 

Prabhu Joseph edited comment on FLINK-31518 at 3/29/23 6:52 AM:


[~huwh]  Yes, that works. Could you please review the 
[patch|https://github.com/apache/flink/pull/22293]  when you get some time?


was (Author: prabhu joseph):
[~huwh]  Yes, that works. I have a [WIP patch 
|https://github.com/PrabhuJoseph/flink/commit/45e268f3d446880b75a666b974d2188bec2e0132#diff-49ab603f69009cb6e7b652cb99570a0df71da897298a09acbc56bf29b42bbc6b].
 Could you please review the same when you get some time?

> HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone 
> mode
> 
>
> Key: FLINK-31518
> URL: https://issues.apache.org/jira/browse/FLINK-31518
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration, Runtime / REST
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.17.1
>Reporter: Vineeth Naroju
>Priority: Major
>  Labels: pull-request-available
>
> {{HighAvailabilityServiceUtils#createHighAvailabilityServices()}}  in 
> {{HighAvailabilityMode.NONE}} mode uses 
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} to get web monitor 
> address.
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} reads only from 
> {{rest.port}} in {{flink-conf.yaml}} . If {{rest.port}} is not enabled, then 
> it returns {{0}} port number. It should dynamically fetch port number if 
> {{rest.port}} is disabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31518) HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone mode

2023-03-29 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17706233#comment-17706233
 ] 

Prabhu Joseph commented on FLINK-31518:
---

[~huwh]  Yes, that works. I have a [WIP patch 
|https://github.com/PrabhuJoseph/flink/commit/45e268f3d446880b75a666b974d2188bec2e0132#diff-49ab603f69009cb6e7b652cb99570a0df71da897298a09acbc56bf29b42bbc6b].
 Could you please review the same when you get some time?

> HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone 
> mode
> 
>
> Key: FLINK-31518
> URL: https://issues.apache.org/jira/browse/FLINK-31518
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration, Runtime / REST
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.17.1
>Reporter: Vineeth Naroju
>Priority: Major
>
> {{HighAvailabilityServiceUtils#createHighAvailabilityServices()}}  in 
> {{HighAvailabilityMode.NONE}} mode uses 
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} to get web monitor 
> address.
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} reads only from 
> {{rest.port}} in {{flink-conf.yaml}} . If {{rest.port}} is not enabled, then 
> it returns {{0}} port number. It should dynamically fetch port number if 
> {{rest.port}} is disabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31518) HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone mode

2023-03-23 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17704268#comment-17704268
 ] 

Prabhu Joseph commented on FLINK-31518:
---

Thanks for the pointers. [~vineethNaroju] and I have come up with a patch based 
on the above inputs.

{quote}And another one is which one is invoked first, the start of 
WebMonitorEndpoint or the creation of high available service?{quote}

The high availability services ({{StandaloneHaServices}}) are initialised first 
with the configured {{clusterRestEndpointAddress}}, which can be a range of 
ports; the {{RestServerEndpoint}} is started afterwards and only then knows the 
actual port.

The patch writes the actual rest address and rest port back into the 
configuration, from which 
{{StandaloneHaServices#getClusterRestEndpointLeaderRetriever}} retrieves them; 
a minimal sketch of the idea follows. A similar approach is used in other 
places, such as for the JobManager RPC port.
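
A minimal sketch of the idea, assuming the {{RestServerEndpoint}} instance is 
at hand; this shows the shape of the change, not the actual patch:

{code:java}
// Sketch only: after the REST endpoint binds, write its real address and port
// back into the configuration so later lookups see them.
restServerEndpoint.start();
// The endpoint knows its real port only after binding (rest.port may be a range).
final InetSocketAddress serverAddress = restServerEndpoint.getServerAddress();
configuration.setString(RestOptions.ADDRESS, serverAddress.getHostString());
configuration.setInteger(RestOptions.PORT, serverAddress.getPort());
// StandaloneHaServices#getClusterRestEndpointLeaderRetriever can now retrieve
// the actual address and port from the configuration.
{code}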

This is the [WIP patch 
|https://github.com/PrabhuJoseph/flink/commit/45e268f3d446880b75a666b974d2188bec2e0132#diff-49ab603f69009cb6e7b652cb99570a0df71da897298a09acbc56bf29b42bbc6b].
 Please review it and let us know your comments.

> HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone 
> mode
> 
>
> Key: FLINK-31518
> URL: https://issues.apache.org/jira/browse/FLINK-31518
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration, Runtime / REST
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.17.1
>Reporter: Vineeth Naroju
>Priority: Major
>
> {{HighAvailabilityServiceUtils#createHighAvailabilityServices()}}  in 
> {{HighAvailabilityMode.NONE}} mode uses 
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} to get web monitor 
> address.
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} reads only from 
> {{rest.port}} in {{flink-conf.yaml}} . If {{rest.port}} is not enabled, then 
> it returns {{0}} port number. It should dynamically fetch port number if 
> {{rest.port}} is disabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31518) HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone mode

2023-03-22 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17703618#comment-17703618
 ] 

Prabhu Joseph commented on FLINK-31518:
---

[~huwh] It is a thread launched by AdaptiveScheduler that requires the 
webMonitorAddress. Currently there is no way to fetch the webMonitorAddress at 
AdaptiveScheduler initialization. We see two options:

Option 1: Pass the webMonitorAddress all the way from 
DefaultDispatcherResourceManagerComponentFactory -> Dispatcher -> JobMaster -> 
DefaultSlotPoolServiceSchedulerFactory -> AdaptiveScheduler. So many factories, 
implementations, and interface APIs would need to be changed to add the 
webMonitorAddress parameter.

Option 2: Fix ClusterRestEndpointLeaderRetriever to provide the right 
webMonitorAddress when rest.port is a range.

Option 1 does not look clean, given how many API changes it requires. I have to 
analyze Option 2 further. There is one more option we considered: a shared 
context class holding all the configuration/information that Dispatcher, 
JobMaster, SlotPoolService, and AdaptiveScheduler can use (sketched below). 
This would require changing only the context class whenever new information 
has to be added later.
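
For illustration, a tiny sketch of that shared-context option; none of these 
names exist in Flink today:

{code:java}
// Hypothetical sketch of the shared-context option discussed above.
final class JobManagerSharedContext {
    private volatile String webMonitorAddress; // set once the REST endpoint knows its real port

    void setWebMonitorAddress(String address) {
        this.webMonitorAddress = address;
    }

    String getWebMonitorAddress() {
        return webMonitorAddress;
    }

    // Dispatcher, JobMaster, SlotPoolService and AdaptiveScheduler would all be
    // handed this single object; adding new information later changes only this
    // class, not the factory and interface chain from Option 1.
}
{code}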

> HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone 
> mode
> 
>
> Key: FLINK-31518
> URL: https://issues.apache.org/jira/browse/FLINK-31518
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration, Runtime / REST
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.17.1
>Reporter: Vineeth Naroju
>Priority: Major
>
> {{HighAvailabilityServiceUtils#createHighAvailabilityServices()}}  in 
> {{HighAvailabilityMode.NONE}} mode uses 
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} to get web monitor 
> address.
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} reads only from 
> {{rest.port}} in {{flink-conf.yaml}} . If {{rest.port}} is not enabled, then 
> it returns {{0}} port number. It should dynamically fetch port number if 
> {{rest.port}} is disabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31518) HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone mode

2023-03-21 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17703076#comment-17703076
 ] 

Prabhu Joseph commented on FLINK-31518:
---

[~huwh] We have a separate service that runs as part of the JobManager and 
queries the REST API exposed by the JobManager to access the Flink metrics. It 
has to discover the WebMonitor address. The rest.port is set to a range of port 
numbers. ClusterRestEndpointLeaderRetriever is used to discover the service; it 
works fine when rest.port is set to a static port but returns 0 when a range is 
set. This Jira intends to fix this so that the right port number is returned 
when a range of ports is configured for rest.port.

{code}
LeaderRetrievalService webMonitorRetrievalService =
    highAvailabilityServices.getClusterRestEndpointLeaderRetriever();
try {
    webMonitorRetrievalService.start(new WebMonitorLeaderListener());
} catch (Exception e) {
    throw new RuntimeException(e);
}

private class WebMonitorLeaderListener implements LeaderRetrievalListener {
    @Override
    public void notifyLeaderAddress(final String leaderAddress, final UUID leaderSessionID) {
        System.out.println(leaderAddress);
    }

    @Override
    public void handleError(final Exception exception) {
        // required by the LeaderRetrievalListener interface
        throw new RuntimeException(exception);
    }
}
{code}



> HighAvailabilityServiceUtils resolves wrong webMonitorAddress in Standalone 
> mode
> 
>
> Key: FLINK-31518
> URL: https://issues.apache.org/jira/browse/FLINK-31518
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Configuration, Runtime / REST
>Affects Versions: 1.16.0, 1.17.0, 1.16.1, 1.17.1
>Reporter: Vineeth Naroju
>Priority: Major
>
> {{HighAvailabilityServiceUtils#createHighAvailabilityServices()}}  in 
> {{HighAvailabilityMode.NONE}} mode uses 
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} to get web monitor 
> address.
> {{HighAvailabilityServiceUtils#getWebMonitorAddress()}} reads only from 
> {{rest.port}} in {{flink-conf.yaml}} . If {{rest.port}} is not enabled, then 
> it returns {{0}} port number. It should dynamically fetch port number if 
> {{rest.port}} is disabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31196) Flink on YARN honors env.java.home

2023-02-23 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-31196:
--
Description: 
Flink on YARN should honor env.java.home and launch the JobManager and 
TaskManager containers with the configured env.java.home.

One option to set the JAVA_HOME for the Flink JobManager and TaskManager 
running on YARN:
{code:java}
containerized.master.env.JAVA_HOME: /usr/lib/jvm/java-11-openjdk
containerized.taskmanager.env.JAVA_HOME: /usr/lib/jvm/java-11-openjdk {code}
 

  was:Flink on YARN, honor env.java.home, and launch JobManager and 
TaskMananger containers with a configured env.java.home.


> Flink on YARN honors env.java.home
> --
>
> Key: FLINK-31196
> URL: https://issues.apache.org/jira/browse/FLINK-31196
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.16.1
>Reporter: Prabhu Joseph
>Priority: Major
>
> Flink on YARN should honor env.java.home and launch the JobManager and 
> TaskManager containers with the configured env.java.home.
> One option to set the JAVA_HOME for the Flink JobManager and TaskManager 
> running on YARN:
> {code:java}
> containerized.master.env.JAVA_HOME: /usr/lib/jvm/java-11-openjdk
> containerized.taskmanager.env.JAVA_HOME: /usr/lib/jvm/java-11-openjdk {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-31196) Flink on YARN honors env.java.home

2023-02-23 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph resolved FLINK-31196.
---
Resolution: Duplicate

> Flink on YARN honors env.java.home
> --
>
> Key: FLINK-31196
> URL: https://issues.apache.org/jira/browse/FLINK-31196
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.16.1
>Reporter: Prabhu Joseph
>Priority: Major
>
> Flink on YARN should honor env.java.home and launch the JobManager and 
> TaskManager containers with the configured env.java.home.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31196) Flink on YARN honors env.java.home

2023-02-23 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17693003#comment-17693003
 ] 

Prabhu Joseph commented on FLINK-31196:
---

Thanks [~zlzhang0122] for the update. I missed searching the Flink Jira list 
properly before raising the ticket. We will close this as a duplicate and 
provide a patch for the original one, FLINK-22091.

> Flink on YARN honors env.java.home
> --
>
> Key: FLINK-31196
> URL: https://issues.apache.org/jira/browse/FLINK-31196
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.16.1
>Reporter: Prabhu Joseph
>Priority: Major
>
> Flink on YARN should honor env.java.home and launch the JobManager and 
> TaskManager containers with the configured env.java.home.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31196) Flink on YARN honors env.java.home

2023-02-23 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-31196:
-

 Summary: Flink on YARN honors env.java.home
 Key: FLINK-31196
 URL: https://issues.apache.org/jira/browse/FLINK-31196
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / YARN
Affects Versions: 1.16.1
Reporter: Prabhu Joseph


Flink on YARN should honor env.java.home and launch the JobManager and 
TaskManager containers with the configured env.java.home.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31080) Idle slots are not released due to a mismatch in time between DeclarativeSlotPoolService and SlotSharingSlotAllocator

2023-02-14 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-31080:
-

 Summary: Idle slots are not released due to a mismatch in time 
between DeclarativeSlotPoolService and SlotSharingSlotAllocator
 Key: FLINK-31080
 URL: https://issues.apache.org/jira/browse/FLINK-31080
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.16.1
Reporter: Prabhu Joseph


Due to a timing mismatch between {{DeclarativeSlotPoolService}} and 
{{SlotSharingSlotAllocator}}, idle slots are not released.

{{DeclarativeSlotPoolService}} uses {{SystemClock#relativeTimeMillis}}, i.e., 
{{System.nanoTime()}} / 1_000_000, when offering a slot, whereas 
{{SlotSharingSlotAllocator}} uses {{System.currentTimeMillis()}} when freeing 
the reserved slot.

The idle timeout check therefore wrongly fails, as 
{{System.currentTimeMillis()}} has a much higher value than 
{{SystemClock#relativeTimeMillis}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30277) Allow PYTHONPATH of Python Worker configurable

2023-01-31 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17682666#comment-17682666
 ] 

Prabhu Joseph commented on FLINK-30277:
---

[~dannycranmer] Could you assign this ticket to Samrat, who is working on the 
changes?

> Allow PYTHONPATH of Python Worker configurable
> --
>
> Key: FLINK-30277
> URL: https://issues.apache.org/jira/browse/FLINK-30277
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.16.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>  Labels: pull-request-available
>
> Currently, below are the ways the Python Worker gets the Python Flink 
> dependencies.
>  # The worker node's system Python path (/usr/local/lib64/python3.7/site-packages)
>  # The client passes the Python dependencies through -pyfs and --pyarch, which 
> are localized into the PYTHONPATH of the Python Worker.
>  # The client passes the requirements through -pyreq, which get installed on 
> the worker node and added into the PYTHONPATH of the Python Worker.
> This Jira intends to make the PYTHONPATH of the Python Worker configurable, so 
> that an admin/service provider can install the required Python Flink 
> dependencies on a custom path (/usr/lib/pyflink/lib/python3.7/site-packages) 
> on all worker nodes and then set the path in the client machine configuration 
> flink-conf.yaml. This way, it works without any configuration from the 
> application users and without affecting other components that depend on the 
> system Python path.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-15635) Allow passing a ClassLoader to EnvironmentSettings

2022-12-12 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646182#comment-17646182
 ] 

Prabhu Joseph edited comment on FLINK-15635 at 12/12/22 3:53 PM:
-

Thanks [~twalthr]. Using the {{ClientWrapperClassLoader}} passed in 
{{CommonExecSink#createSinkTransformation}} fixes the issue. I have created 
FLINK-30377 to handle this.


was (Author: prabhu joseph):
Thanks [~twalthr]. Using {{ClientWrapperClassLoader}} passed in the 
{{CommonExecSink#
createSinkTransformation}} fixes the issue. Have created FLINK-30377 to handle 
this.

> Allow passing a ClassLoader to EnvironmentSettings
> --
>
> Key: FLINK-15635
> URL: https://issues.apache.org/jira/browse/FLINK-15635
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: Francesco Guardiani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> We had a couple of class loading issues in the past because people forgot to 
> use the right classloader in {{flink-table}}. The SQL Client executor code 
> hacks a classloader into the planner process by using {{wrapClassLoader}} 
> that sets the threads context classloader.
> Instead we should allow passing a class loader to environment settings. This 
> class loader can be passed to the planner and can be stored in table 
> environment, table config, etc. to have a consistent class loading behavior.
> Having this in place should replace the need for 
> {{Thread.currentThread().getContextClassLoader()}} in the entire 
> {{flink-table}} module.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-15635) Allow passing a ClassLoader to EnvironmentSettings

2022-12-12 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646182#comment-17646182
 ] 

Prabhu Joseph commented on FLINK-15635:
---

Thanks [~twalthr]. Using {{ClientWrapperClassLoader}} passed in the 
{{CommonExecSink#
createSinkTransformation}} fixes the issue. Have created FLINK-30377 to handle 
this.

> Allow passing a ClassLoader to EnvironmentSettings
> --
>
> Key: FLINK-15635
> URL: https://issues.apache.org/jira/browse/FLINK-15635
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: Francesco Guardiani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> We had a couple of class loading issues in the past because people forgot to 
> use the right classloader in {{flink-table}}. The SQL Client executor code 
> hacks a classloader into the planner process by using {{wrapClassLoader}} 
> that sets the threads context classloader.
> Instead we should allow passing a class loader to environment settings. This 
> class loader can be passed to the planner and can be stored in table 
> environment, table config, etc. to have a consistent class loading behavior.
> Having this in place should replace the need for 
> {{Thread.currentThread().getContextClassLoader()}} in the entire 
> {{flink-table}} module.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30377) CommonExecSink does not use ClientWrapperClassLoader while extracting Type from KeyedStream

2022-12-12 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-30377:
--
Description: 
CommonExecSink does not use the ClientWrapperClassLoader while extracting the 
type from a KeyedStream. This leads to a ClassNotFoundException for user 
classes added through the ADD JAR command (a sketch of the usual 
classloader-scoping pattern follows the stack trace below). This worked fine 
on Flink 1.15.

 
{code:java}
Caused by: java.lang.ClassNotFoundException: 
org.apache.hudi.common.model.HoodieRecord
 at java.net.URLClassLoader.findClass(URLClassLoader.java:387) ~[?:1.8.0_352]
 at java.lang.ClassLoader.loadClass(ClassLoader.java:418) ~[?:1.8.0_352]
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) ~[?:1.8.0_352]
 at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ~[?:1.8.0_352]
 at java.lang.Class.forName0(Native Method) ~[?:1.8.0_352]
 at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_352]
 at 
org.apache.flink.api.java.typeutils.TypeExtractionUtils.checkAndExtractLambda(TypeExtractionUtils.java:143)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:539)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:415)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:406)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.streaming.api.datastream.KeyedStream.(KeyedStream.java:116)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:300) 
~[flink-dist-1.16.0.jar:1.16.0]
 at org.apache.hudi.sink.utils.Pipelines.hoodieStreamWrite(Pipelines.java:339) 
~[?:?]
 at 
org.apache.hudi.table.HoodieTableSink.lambda$getSinkRuntimeProvider$0(HoodieTableSink.java:104)
 ~[?:?]
 at 
org.apache.hudi.adapter.DataStreamSinkProviderAdapter.consumeDataStream(DataStreamSinkProviderAdapter.java:35)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecSink.applySinkProvider(CommonExecSink.java:483)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecSink.createSinkTransformation(CommonExecSink.java:203)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.java:176)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:158)
 ~[?:?]
 at 
org.apache.flink.table.planner.delegation.StreamPlanner.$anonfun$translateToPlan$1(StreamPlanner.scala:85)
 ~[?:?]
 at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.Iterator.foreach(Iterator.scala:937) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.Iterator.foreach$(Iterator.scala:937) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1425) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.IterableLike.foreach(IterableLike.scala:70) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.IterableLike.foreach$(IterableLike.scala:69) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.AbstractIterable.foreach(Iterable.scala:54) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.TraversableLike.map(TraversableLike.scala:233) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.TraversableLike.map$(TraversableLike.scala:226) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.AbstractTraversable.map(Traversable.scala:104) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at 
org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:84)
 ~[?:?]
 at 
org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:197)
 ~[?:?]
 at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1733)
 ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
 at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:825)
 ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
 at 
org.apache.flink.table.client.gateway.local.LocalExecutor.executeModifyOperations(LocalExecutor.java:219)
 ~[flink-sql-client-1.16.0.jar:1.16.0]

 {code}
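
For context, the general pattern such classloader fixes follow; this is 
illustrative, not the actual FLINK-30377 change, and {{userClassLoader}} is a 
stand-in for the session classloader that has seen the ADD JAR statement:

{code:java}
// Scope the thread context classloader around the code that triggers type
// extraction, so Class.forName resolves user classes from the ADD JAR jars.
final ClassLoader previous = Thread.currentThread().getContextClassLoader();
try {
    Thread.currentThread().setContextClassLoader(userClassLoader);
    // ... DataStream#keyBy / TypeExtractor calls that load user classes ...
} finally {
    Thread.currentThread().setContextClassLoader(previous);
}
{code}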
 

 

  was:
CommonExecSink does not use ClientWrapperClassLoader while extracting Type from 
KeyedStream. This will lead to ClassNotFoundException on user classes added 
through add jar command.

 
{code:java}
Caused by: java.lang.ClassNotFoundException: 
org.apache.hudi.common.model.HoodieRecord
 at java.net.URLClassLoader.findClass(URLClassLoader.java:387) ~[?:1.8.0_352]
 at java.lang.ClassLoader.loadClass(ClassLoader.java:418) ~[?:1.8.0_352]
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) ~[?:1.8.0_352]
 at 

[jira] [Created] (FLINK-30377) CommonExecSink does not use ClientWrapperClassLoader while extracting Type from KeyedStream

2022-12-12 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-30377:
-

 Summary: CommonExecSink does not use ClientWrapperClassLoader 
while extracting Type from KeyedStream
 Key: FLINK-30377
 URL: https://issues.apache.org/jira/browse/FLINK-30377
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.16.0
Reporter: Prabhu Joseph


CommonExecSink does not use the ClientWrapperClassLoader while extracting the 
type from a KeyedStream. This leads to a ClassNotFoundException for user 
classes added through the ADD JAR command.

 
{code:java}
Caused by: java.lang.ClassNotFoundException: 
org.apache.hudi.common.model.HoodieRecord
 at java.net.URLClassLoader.findClass(URLClassLoader.java:387) ~[?:1.8.0_352]
 at java.lang.ClassLoader.loadClass(ClassLoader.java:418) ~[?:1.8.0_352]
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) ~[?:1.8.0_352]
 at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ~[?:1.8.0_352]
 at java.lang.Class.forName0(Native Method) ~[?:1.8.0_352]
 at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_352]
 at 
org.apache.flink.api.java.typeutils.TypeExtractionUtils.checkAndExtractLambda(TypeExtractionUtils.java:143)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:539)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:415)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:406)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.streaming.api.datastream.KeyedStream.(KeyedStream.java:116)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:300) 
~[flink-dist-1.16.0.jar:1.16.0]
 at org.apache.hudi.sink.utils.Pipelines.hoodieStreamWrite(Pipelines.java:339) 
~[?:?]
 at 
org.apache.hudi.table.HoodieTableSink.lambda$getSinkRuntimeProvider$0(HoodieTableSink.java:104)
 ~[?:?]
 at 
org.apache.hudi.adapter.DataStreamSinkProviderAdapter.consumeDataStream(DataStreamSinkProviderAdapter.java:35)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecSink.applySinkProvider(CommonExecSink.java:483)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecSink.createSinkTransformation(CommonExecSink.java:203)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.java:176)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:158)
 ~[?:?]
 at 
org.apache.flink.table.planner.delegation.StreamPlanner.$anonfun$translateToPlan$1(StreamPlanner.scala:85)
 ~[?:?]
 at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.Iterator.foreach(Iterator.scala:937) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.Iterator.foreach$(Iterator.scala:937) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1425) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.IterableLike.foreach(IterableLike.scala:70) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.IterableLike.foreach$(IterableLike.scala:69) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.AbstractIterable.foreach(Iterable.scala:54) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.TraversableLike.map(TraversableLike.scala:233) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.TraversableLike.map$(TraversableLike.scala:226) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.AbstractTraversable.map(Traversable.scala:104) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at 
org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:84)
 ~[?:?]
 at 
org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:197)
 ~[?:?]
 at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1733)
 ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
 at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:825)
 ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
 at 
org.apache.flink.table.client.gateway.local.LocalExecutor.executeModifyOperations(LocalExecutor.java:219)
 ~[flink-sql-client-1.16.0.jar:1.16.0]

 {code}
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30375) SqlClient leaks flink-table-planner jar under /tmp

2022-12-12 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-30375:
-

 Summary: SqlClient leaks flink-table-planner jar under /tmp
 Key: FLINK-30375
 URL: https://issues.apache.org/jira/browse/FLINK-30375
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.16.0
Reporter: Prabhu Joseph


SqlClient leaks flink-table-planner jar under /tmp

 
{code:java}

[root@ip-172-1-1-3 lib]# ls -lrt /tmp/flink-table-planner_*
-rw-rw-r-- 1 hadoop hadoop 39138893 Dec 12 10:08 
/tmp/flink-table-planner_acada33f-a10b-4a4a-ad16-6bca25a67e10.jar
-rw-rw-r-- 1 hadoop hadoop 39138893 Dec 12 10:17 
/tmp/flink-table-planner_fb2f6e31-48a0-4c1e-ab7b-16129e776125.jar
-rw-rw-r-- 1 hadoop hadoop 39138893 Dec 12 10:22 
/tmp/flink-table-planner_83499393-1621-43de-953c-2000bc6967ce.jar
-rw-rw-r-- 1 hadoop hadoop 39138893 Dec 12 10:24 
/tmp/flink-table-planner_3aa798da-e794-4c6c-ad9f-49b4574da64b.jar
-rw-rw-r-- 1 hadoop hadoop 39138893 Dec 12 10:36 
/tmp/flink-table-planner_ea52d9ea-3148-4fc3-8a58-f83063bb14d5.jar
-rw-rw-r-- 1 hadoop hadoop 39138893 Dec 12 10:44 
/tmp/flink-table-planner_fdc64e21-6bc7-4e4a-b8b2-2a97629d0727.jar
-rw-rw-r-- 1 hadoop hadoop 39137545 Dec 12 11:05 
/tmp/flink-table-planner_84c00b13-c6b5-4e8b-80cb-5631a2fa3150.jar
-rw-rw-r-- 1 hadoop hadoop 39137601 Dec 12 11:10 
/tmp/flink-table-planner_d00b8a7b-e615-46f7-bd21-b5efa684c184.jar
-rw-rw-r-- 1 hadoop hadoop 39137601 Dec 12 11:11 
/tmp/flink-table-planner_a413520d-ce15-41f9-aa5e-05a30f7eaff5.jar {code}
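
For context, a minimal sketch of the usual cleanup pattern for such extracted 
temp files; this is illustrative only, not the actual SqlClient fix:

{code:java}
// Register extracted temp jars for deletion so they do not accumulate in /tmp.
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class TempJarCleanupSketch {
    public static void main(String[] args) throws IOException {
        File plannerJar = Files.createTempFile("flink-table-planner_", ".jar").toFile();
        // Without deleteOnExit (or an explicit shutdown-hook cleanup), the jar
        // leaks under /tmp exactly as in the listing above.
        plannerJar.deleteOnExit();
        System.out.println("extracted to " + plannerJar);
    }
}
{code}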



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-15635) Allow passing a ClassLoader to EnvironmentSettings

2022-12-11 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645804#comment-17645804
 ] 

Prabhu Joseph commented on FLINK-15635:
---

We are seeing the ClassNotFoundException below in the SQL Client after 
upgrading Hudi's Flink version to 1.16. This works fine with Flink 1.15.

The hudi-flink-bundle jar is added through the ADD JAR command. Any idea how to 
fix this issue? 

 
{code:java}
 Caused by: java.lang.ClassNotFoundException: 
org.apache.hudi.common.model.HoodieRecord
 at java.net.URLClassLoader.findClass(URLClassLoader.java:387) ~[?:1.8.0_352]
 at java.lang.ClassLoader.loadClass(ClassLoader.java:418) ~[?:1.8.0_352]
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) ~[?:1.8.0_352]
 at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ~[?:1.8.0_352]
 at java.lang.Class.forName0(Native Method) ~[?:1.8.0_352]
 at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_352]
 at 
org.apache.flink.api.java.typeutils.TypeExtractionUtils.checkAndExtractLambda(TypeExtractionUtils.java:143)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:539)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:415)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:406)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.streaming.api.datastream.KeyedStream.(KeyedStream.java:116)
 ~[flink-dist-1.16.0.jar:1.16.0]
 at 
org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:300) 
~[flink-dist-1.16.0.jar:1.16.0]
 at org.apache.hudi.sink.utils.Pipelines.hoodieStreamWrite(Pipelines.java:339) 
~[?:?]
 at 
org.apache.hudi.table.HoodieTableSink.lambda$getSinkRuntimeProvider$0(HoodieTableSink.java:104)
 ~[?:?]
 at 
org.apache.hudi.adapter.DataStreamSinkProviderAdapter.consumeDataStream(DataStreamSinkProviderAdapter.java:35)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecSink.applySinkProvider(CommonExecSink.java:483)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.common.CommonExecSink.createSinkTransformation(CommonExecSink.java:203)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.java:176)
 ~[?:?]
 at 
org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase.translateToPlan(ExecNodeBase.java:158)
 ~[?:?]
 at 
org.apache.flink.table.planner.delegation.StreamPlanner.$anonfun$translateToPlan$1(StreamPlanner.scala:85)
 ~[?:?]
 at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.Iterator.foreach(Iterator.scala:937) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.Iterator.foreach$(Iterator.scala:937) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1425) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.IterableLike.foreach(IterableLike.scala:70) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.IterableLike.foreach$(IterableLike.scala:69) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.AbstractIterable.foreach(Iterable.scala:54) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.TraversableLike.map(TraversableLike.scala:233) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.TraversableLike.map$(TraversableLike.scala:226) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at scala.collection.AbstractTraversable.map(Traversable.scala:104) 
~[flink-scala_2.12-1.16.0.jar:1.16.0]
 at 
org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:84)
 ~[?:?]
 at 
org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:197)
 ~[?:?]
 at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1733)
 ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
 at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:825)
 ~[flink-table-api-java-uber-1.16.0.jar:1.16.0]
 at 
org.apache.flink.table.client.gateway.local.LocalExecutor.executeModifyOperations(LocalExecutor.java:219)
 ~[flink-sql-client-1.16.0.jar:1.16.0]{code}
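For reference, FLINK-15635 (fixed in 1.16) added a class loader hook on the 
settings builder. Below is a minimal sketch of wiring a user class loader 
through it, assuming the 1.16 EnvironmentSettings.Builder#withClassLoader API; 
the jar path is hypothetical (in the SQL Client the jar comes from ADD JAR):

{code:java}
import java.net.URL;
import java.net.URLClassLoader;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ClassLoaderSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical path; stands in for a jar registered via ADD JAR.
        URL bundleJar = new URL("file:///path/to/hudi-flink-bundle.jar");

        // A class loader that can see the bundle classes (e.g. HoodieRecord).
        ClassLoader userClassLoader =
                new URLClassLoader(
                        new URL[] {bundleJar},
                        Thread.currentThread().getContextClassLoader());

        // Hand the class loader to the planner via EnvironmentSettings so user
        // classes stay resolvable during plan translation.
        EnvironmentSettings settings =
                EnvironmentSettings.newInstance()
                        .withClassLoader(userClassLoader)
                        .build();

        TableEnvironment tEnv = TableEnvironment.create(settings);
    }
}
{code}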
 

 

> Allow passing a ClassLoader to EnvironmentSettings
> --
>
> Key: FLINK-15635
> URL: https://issues.apache.org/jira/browse/FLINK-15635
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Timo Walther
>Assignee: Francesco Guardiani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> We had a couple of class 

[jira] [Commented] (FLINK-30277) Allow PYTHONPATH of Python Worker configurable

2022-12-02 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642542#comment-17642542
 ] 

Prabhu Joseph commented on FLINK-30277:
---

Thanks [~dannycranmer] 

> Allow PYTHONPATH of Python Worker configurable
> --
>
> Key: FLINK-30277
> URL: https://issues.apache.org/jira/browse/FLINK-30277
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.16.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> Currently, below are the ways Python Worker gets the Python Flink 
> Dependencies.
>  # Worker Node's System Python Path (/usr/local/lib64/python3.7/site-packages)
>  # Client passes the Python dependencies through -pyfs and -pyarch, which are 
> localized into the PYTHONPATH of the Python Worker.
>  # Client passes the requirements through -pyreq, which get installed on the 
> Worker Node and added into the PYTHONPATH of the Python Worker.
> This Jira intends to make the PYTHONPATH of the Python Worker configurable, 
> where an Admin/Service provider can install the required Python Flink 
> dependencies on a custom path (/usr/lib/pyflink/lib/python3.7/site-packages) 
> on all Worker Nodes and then set the path in the client machine configuration 
> flink-conf.yaml. This way it works without any configuration from the 
> Application Users and also without affecting any other components dependent 
> on the System Python Path.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30277) Allow PYTHONPATH of Python Worker configurable

2022-12-02 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-30277:
--
Description: 
Currently, below are the ways Python Worker gets the Python Flink Dependencies.
 # Worker Node's System Python Path (/usr/local/lib64/python3.7/site-packages)
 # Client passes the Python dependencies through -pyfs and -pyarch, which are 
localized into the PYTHONPATH of the Python Worker.
 # Client passes the requirements through -pyreq, which get installed on the Worker 
Node and added into the PYTHONPATH of the Python Worker.

This Jira intends to make the PYTHONPATH of the Python Worker configurable, where 
an Admin/Service provider can install the required Python Flink dependencies on a 
custom path (/usr/lib/pyflink/lib/python3.7/site-packages) on all Worker Nodes 
and then set the path in the client machine configuration flink-conf.yaml. This 
way it works without any configuration from the Application Users and also 
without affecting any other components dependent on the System Python Path.

 

  was:
Currently, below are the ways Python Worker gets the Python Flink Dependencies.
 # Worker Node's System Python Path (/usr/local/lib64/python3.7/site-packages)
 # Client passes the Python dependencies through -pyfs and -pyarch, which are 
localized into the PYTHONPATH of the Python Worker.
 # Client passes the requirements through -pyreq, which get installed on the Worker 
Node and added into the PYTHONPATH of the Python Worker.

This Jira intends to allow the PYTHONPATH of the Python Worker to be configurable, 
where an admin/Service provider can install the required Python Flink dependencies 
on a custom path (/usr/lib/pyflink/lib/python3.7/site-packages) on all Worker Nodes 
and then set the path in the client machine configuration flink-conf.yaml. This 
way it works without any configuration from the Application Users and also 
without affecting any other components dependent on the System Python Path.

 


> Allow PYTHONPATH of Python Worker configurable
> --
>
> Key: FLINK-30277
> URL: https://issues.apache.org/jira/browse/FLINK-30277
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.16.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Currently, below are the ways Python Worker gets the Python Flink 
> Dependencies.
>  # Worker Node's System Python Path (/usr/local/lib64/python3.7/site-packages)
>  # Client passes the Python dependencies through -pyfs and -pyarch, which are 
> localized into the PYTHONPATH of the Python Worker.
>  # Client passes the requirements through -pyreq, which get installed on the 
> Worker Node and added into the PYTHONPATH of the Python Worker.
> This Jira intends to make the PYTHONPATH of the Python Worker configurable, 
> where an Admin/Service provider can install the required Python Flink 
> dependencies on a custom path (/usr/lib/pyflink/lib/python3.7/site-packages) 
> on all Worker Nodes and then set the path in the client machine configuration 
> flink-conf.yaml. This way it works without any configuration from the 
> Application Users and also without affecting any other components dependent 
> on the System Python Path.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30277) Allow PYTHONPATH of Python Worker configurable

2022-12-02 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-30277:
--
Summary: Allow PYTHONPATH of Python Worker configurable  (was: Allow 
PYTHONPATH of Python Worker to be configurable)

> Allow PYTHONPATH of Python Worker configurable
> --
>
> Key: FLINK-30277
> URL: https://issues.apache.org/jira/browse/FLINK-30277
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.16.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> Currently, below are the ways Python Worker gets the Python Flink 
> Dependencies.
>  # Worker Node's System Python Path (/usr/local/lib64/python3.7/site-packages)
>  # Client passes the Python dependencies through -pyfs and -pyarch, which are 
> localized into the PYTHONPATH of the Python Worker.
>  # Client passes the requirements through -pyreq, which get installed on the 
> Worker Node and added into the PYTHONPATH of the Python Worker.
> This Jira intends to allow the PYTHONPATH of the Python Worker to be 
> configurable, where an admin/Service provider can install the required Python 
> Flink dependencies on a custom path 
> (/usr/lib/pyflink/lib/python3.7/site-packages) on all Worker Nodes and then 
> set the path in the client machine configuration flink-conf.yaml. This way it 
> works without any configuration from the Application Users and also without 
> affecting any other components dependent on the System Python Path.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30277) Allow PYTHONPATH of Python Worker to be configurable

2022-12-02 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-30277:
-

 Summary: Allow PYTHONPATH of Python Worker to be configurable
 Key: FLINK-30277
 URL: https://issues.apache.org/jira/browse/FLINK-30277
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Affects Versions: 1.16.0
Reporter: Prabhu Joseph


Currently, below are the ways Python Worker gets the Python Flink Dependencies.
 # Worker Node's System Python Path (/usr/local/lib64/python3.7/site-packages)
 # Client passes the Python dependencies through -pyfs and -pyarch, which are 
localized into the PYTHONPATH of the Python Worker.
 # Client passes the requirements through -pyreq, which get installed on the Worker 
Node and added into the PYTHONPATH of the Python Worker.

This Jira intends to allow the PYTHONPATH of the Python Worker to be configurable, 
where an admin/Service provider can install the required Python Flink dependencies 
on a custom path (/usr/lib/pyflink/lib/python3.7/site-packages) on all Worker Nodes 
and then set the path in the client machine configuration flink-conf.yaml. This 
way it works without any configuration from the Application Users and also 
without affecting any other components dependent on the System Python Path.
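For illustration, the client-side configuration described above could look like 
the sketch below. The key name is an assumption here, not necessarily the final 
option name defined by this Jira's patch:

{code}
# flink-conf.yaml (sketch only; "python.pythonpath" is a hypothetical key name)
python.pythonpath: /usr/lib/pyflink/lib/python3.7/site-packages
{code}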

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-30111) CacheRead fails with Intermediate data set with ID not found

2022-11-21 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-30111:
--
Description: 
CacheRead fails with the below exception when running multiple parallel jobs in 
detached mode that all read from the same CachedDataStream. The same application 
runs fine when either running in attached mode or when not using the cache.
{code:java}
2022-11-21 08:19:31,762 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - CacheRead -> 
Map -> Sink: Writer (1/1) 
(8002916773ad489098a05e6835288f29_cbc357ccb763df2852fee8c4fc7d55f2_0_0) 
switched from RUNNING to FAILED on container_1668960408356_0009_01_09 @ 
ip-172-31-38-144.us-west-2.compute.internal (dataPort=38433).
java.lang.IllegalArgumentException: Intermediate data set with ID 
f0d8150945d3e396b8c0a4f6a527a8ce not found.
at 
org.apache.flink.runtime.scheduler.ExecutionGraphHandler.requestPartitionState(ExecutionGraphHandler.java:173)
 ~[flink-dist-1.16.0.jar:1.16.0]
at 
org.apache.flink.runtime.scheduler.SchedulerBase.requestPartitionState(SchedulerBase.java:763)
 ~[flink-dist-1.16.0.jar:1.16.0]
at 
org.apache.flink.runtime.jobmaster.JobMaster.requestPartitionState(JobMaster.java:515)
 ~[flink-dist-1.16.0.jar:1.16.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_342]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_342]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_342]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_342]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:309)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:307)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:222)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:84)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:168)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.actor.Actor.aroundReceive(Actor.scala:537) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.actor.Actor.aroundReceive$(Actor.scala:535) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.actor.ActorCell.invoke(ActorCell.scala:548) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.dispatch.Mailbox.run(Mailbox.scala:231) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.dispatch.Mailbox.exec(Mailbox.scala:243) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 

[jira] [Created] (FLINK-30111) CacheRead fails with Intermediate data set with ID not found

2022-11-21 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-30111:
-

 Summary: CacheRead fails with Intermediate data set with ID not 
found
 Key: FLINK-30111
 URL: https://issues.apache.org/jira/browse/FLINK-30111
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Prabhu Joseph


CacheRead fails with the below exception when running multiple parallel jobs in 
detached mode that all read from the same CachedDataStream. The same application 
runs fine when either running in attached mode or when not using the cache.
{code:java}
2022-11-21 08:19:31,762 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - CacheRead -> 
Map -> Sink: Writer (1/1) 
(8002916773ad489098a05e6835288f29_cbc357ccb763df2852fee8c4fc7d55f2_0_0) 
switched from RUNNING to FAILED on container_1668960408356_0009_01_09 @ 
ip-172-31-38-144.us-west-2.compute.internal (dataPort=38433).
java.lang.IllegalArgumentException: Intermediate data set with ID 
f0d8150945d3e396b8c0a4f6a527a8ce not found.
at 
org.apache.flink.runtime.scheduler.ExecutionGraphHandler.requestPartitionState(ExecutionGraphHandler.java:173)
 ~[flink-dist-1.16.0.jar:1.16.0]
at 
org.apache.flink.runtime.scheduler.SchedulerBase.requestPartitionState(SchedulerBase.java:763)
 ~[flink-dist-1.16.0.jar:1.16.0]
at 
org.apache.flink.runtime.jobmaster.JobMaster.requestPartitionState(JobMaster.java:515)
 ~[flink-dist-1.16.0.jar:1.16.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_342]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_342]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_342]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_342]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:309)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 
org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:307)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:222)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:84)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:168)
 ~[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.actor.Actor.aroundReceive(Actor.scala:537) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.actor.Actor.aroundReceive$(Actor.scala:535) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.actor.ActorCell.invoke(ActorCell.scala:548) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at akka.dispatch.Mailbox.run(Mailbox.scala:231) 
[flink-rpc-akka_e220e3c9-e81e-4259-9655-37a1f83e8a36.jar:1.16.0]
at 

[jira] [Created] (FLINK-30097) CachedDataStream java example in the document is not correct

2022-11-19 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-30097:
-

 Summary: CachedDataStream java example in the document is not 
correct
 Key: FLINK-30097
 URL: https://issues.apache.org/jira/browse/FLINK-30097
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.16.0
Reporter: Prabhu Joseph


CachedDataStream Java example in the document is not correct - 
[https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/overview/#datastream-rarr-cacheddatastream]

 
{code:java}
DataStream dataStream = //...
CachedDataStream cachedDataStream = dataStream.cache();{code}

The example shows invoking cache() on a DataStream instance, but the DataStream 
class does not have a cache() method. The correct usage is to call cache() on an 
instance of DataStreamSource/SideOutputDataStream/SingleOutputStreamOperator, as 
in the sketch below. 
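A minimal sketch of the corrected usage, assuming the Flink 1.16 FLIP-205 API 
where cache() is defined on SingleOutputStreamOperator (and DataStreamSource) 
and returns a CachedDataStream:

{code:java}
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.CachedDataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CacheSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Caching intermediate results is a batch-mode feature.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        // map() returns SingleOutputStreamOperator, which does have cache();
        // the DataStream base class does not.
        SingleOutputStreamOperator<Integer> doubled =
                env.fromElements(1, 2, 3).map(i -> i * 2).returns(Types.INT);
        CachedDataStream<Integer> cached = doubled.cache();

        cached.print();
        env.execute(); // the first execution computes and caches the result

        cached.print();
        env.execute(); // later executions can reuse the cached result
    }
}
{code}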
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29680) Create Database with DatabaseName "default" fails

2022-10-18 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619537#comment-17619537
 ] 

Prabhu Joseph commented on FLINK-29680:
---

It worked with backticks. Thanks.
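For anyone hitting the same parse error, a minimal sketch of the backtick 
workaround, shown through a TableEnvironment (the same quoting works when typed 
directly into the SQL Client):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DefaultDatabaseSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(
                        EnvironmentSettings.newInstance().inStreamingMode().build());

        // "default" is a reserved keyword in the Flink SQL parser; quoting the
        // identifier with backticks turns it back into a plain identifier.
        tEnv.executeSql("CREATE DATABASE `default`");
        tEnv.executeSql("USE `default`");
    }
}
{code}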

> Create Database with DatabaseName "default" fails
> -
>
> Key: FLINK-29680
> URL: https://issues.apache.org/jira/browse/FLINK-29680
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.15.1
>Reporter: Prabhu Joseph
>Priority: Minor
>
> Create Database with DatabaseName "default" fails in Default Dialect.
> *Exception:*
> {code:java}
> Caused by: org.apache.flink.sql.parser.impl.ParseException: Encountered 
> "default" at line 1, column 17.
> Was expecting one of:
>  ...
>  ...
>  ...
>  ...
>  ...
>  ...
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:42459)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.jj_consume_token(FlinkSqlParserImpl.java:42270)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.IdentifierSegment(FlinkSqlParserImpl.java:26389)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.CompoundIdentifier(FlinkSqlParserImpl.java:26869)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreateDatabase(FlinkSqlParserImpl.java:4188)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreateExtended(FlinkSqlParserImpl.java:6583)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreate(FlinkSqlParserImpl.java:23158)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmt(FlinkSqlParserImpl.java:3507)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmtList(FlinkSqlParserImpl.java:2911)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.parseSqlStmtList(FlinkSqlParserImpl.java:287)
>  ~[?:?]
> at org.apache.calcite.sql.parser.SqlParser.parseStmtList(SqlParser.java:193) 
> ~[?:?]
> at 
> org.apache.flink.table.planner.parse.CalciteParser.parseSqlList(CalciteParser.java:77)
>  ~[?:?]
> at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:101)
>  ~[?:?]
> at 
> org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$parseStatement$1(LocalExecutor.java:172)
>  ~[flink-sql-client-1.15.1.jar:1.15.1]
> at 
> org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88)
>  ~[flink-sql-client-1.15.1.jar:1.15.1]
> at 
> org.apache.flink.table.client.gateway.local.LocalExecutor.parseStatement(LocalExecutor.java:172)
>  ~[flink-sql-client-1.15.1.jar:1.15.1]
> ... 11 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29680) Create Database with DatabaseName "default" fails

2022-10-18 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-29680:
--
Description: 
Create Database with DatabaseName "default" fails in Default Dialect.

*Exception:*
{code:java}
Caused by: org.apache.flink.sql.parser.impl.ParseException: Encountered 
"default" at line 1, column 17.
Was expecting one of:
 ...
 ...
 ...
 ...
 ...
 ...

at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:42459)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.jj_consume_token(FlinkSqlParserImpl.java:42270)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.IdentifierSegment(FlinkSqlParserImpl.java:26389)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.CompoundIdentifier(FlinkSqlParserImpl.java:26869)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreateDatabase(FlinkSqlParserImpl.java:4188)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreateExtended(FlinkSqlParserImpl.java:6583)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreate(FlinkSqlParserImpl.java:23158)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmt(FlinkSqlParserImpl.java:3507)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmtList(FlinkSqlParserImpl.java:2911)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.parseSqlStmtList(FlinkSqlParserImpl.java:287)
 ~[?:?]
at org.apache.calcite.sql.parser.SqlParser.parseStmtList(SqlParser.java:193) 
~[?:?]
at 
org.apache.flink.table.planner.parse.CalciteParser.parseSqlList(CalciteParser.java:77)
 ~[?:?]
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:101) 
~[?:?]
at 
org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$parseStatement$1(LocalExecutor.java:172)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
at 
org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
at 
org.apache.flink.table.client.gateway.local.LocalExecutor.parseStatement(LocalExecutor.java:172)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
... 11 more
{code}

  was:
Create/Use Database with DatabaseName "default" fails in both the Default and 
Hive dialects.

*Default Dialect:*
{code:java}
Flink SQL> create database default;
 #fails{code}
*Hive Dialect:*
{code:java}
Flink SQL> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = 
'/etc/hive/conf');
Flink SQL> use catalog myhive;
Flink SQL> show databases;
+---+
| database name |
+---+
|   default |
|   emr |
+---+
2 rows in set
Flink SQL> use emr;
   #works
Flink SQL> use default;
   #fails
{code}
 

*Exception:*
{code:java}
Caused by: org.apache.flink.sql.parser.impl.ParseException: Encountered 
"default" at line 1, column 17.
Was expecting one of:
 ...
 ...
 ...
 ...
 ...
 ...

at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:42459)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.jj_consume_token(FlinkSqlParserImpl.java:42270)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.IdentifierSegment(FlinkSqlParserImpl.java:26389)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.CompoundIdentifier(FlinkSqlParserImpl.java:26869)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreateDatabase(FlinkSqlParserImpl.java:4188)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreateExtended(FlinkSqlParserImpl.java:6583)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreate(FlinkSqlParserImpl.java:23158)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmt(FlinkSqlParserImpl.java:3507)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmtList(FlinkSqlParserImpl.java:2911)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.parseSqlStmtList(FlinkSqlParserImpl.java:287)
 ~[?:?]
at org.apache.calcite.sql.parser.SqlParser.parseStmtList(SqlParser.java:193) 
~[?:?]
at 
org.apache.flink.table.planner.parse.CalciteParser.parseSqlList(CalciteParser.java:77)
 ~[?:?]
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:101) 
~[?:?]
at 
org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$parseStatement$1(LocalExecutor.java:172)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
at 
org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
at 
org.apache.flink.table.client.gateway.local.LocalExecutor.parseStatement(LocalExecutor.java:172)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
... 11 more
{code}


> Create Database with DatabaseName "default" fails
> 

[jira] [Updated] (FLINK-29680) Create Database with DatabaseName "default" fails

2022-10-18 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-29680:
--
Summary: Create Database with DatabaseName "default" fails  (was: 
Create/Use Database with DatabaseName "default" fails)

> Create Database with DatabaseName "default" fails
> -
>
> Key: FLINK-29680
> URL: https://issues.apache.org/jira/browse/FLINK-29680
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.15.1
>Reporter: Prabhu Joseph
>Priority: Minor
>
> Create/Use Database with DatabaseName "default" fails in both the Default 
> and Hive dialects.
> *Default Dialect:*
> {code:java}
> Flink SQL> create database default;
>  #fails{code}
> *Hive Dialect:*
> {code:java}
> Flink SQL> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = 
> '/etc/hive/conf');
> Flink SQL> use catalog myhive;
> Flink SQL> show databases;
> +---+
> | database name |
> +---+
> |   default |
> |   emr |
> +---+
> 2 rows in set
> Flink SQL> use emr;
>#works
> Flink SQL> use default;
>#fails
> {code}
>  
> *Exception:*
> {code:java}
> Caused by: org.apache.flink.sql.parser.impl.ParseException: Encountered 
> "default" at line 1, column 17.
> Was expecting one of:
>  ...
>  ...
>  ...
>  ...
>  ...
>  ...
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:42459)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.jj_consume_token(FlinkSqlParserImpl.java:42270)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.IdentifierSegment(FlinkSqlParserImpl.java:26389)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.CompoundIdentifier(FlinkSqlParserImpl.java:26869)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreateDatabase(FlinkSqlParserImpl.java:4188)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreateExtended(FlinkSqlParserImpl.java:6583)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreate(FlinkSqlParserImpl.java:23158)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmt(FlinkSqlParserImpl.java:3507)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmtList(FlinkSqlParserImpl.java:2911)
>  ~[?:?]
> at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.parseSqlStmtList(FlinkSqlParserImpl.java:287)
>  ~[?:?]
> at org.apache.calcite.sql.parser.SqlParser.parseStmtList(SqlParser.java:193) 
> ~[?:?]
> at 
> org.apache.flink.table.planner.parse.CalciteParser.parseSqlList(CalciteParser.java:77)
>  ~[?:?]
> at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:101)
>  ~[?:?]
> at 
> org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$parseStatement$1(LocalExecutor.java:172)
>  ~[flink-sql-client-1.15.1.jar:1.15.1]
> at 
> org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88)
>  ~[flink-sql-client-1.15.1.jar:1.15.1]
> at 
> org.apache.flink.table.client.gateway.local.LocalExecutor.parseStatement(LocalExecutor.java:172)
>  ~[flink-sql-client-1.15.1.jar:1.15.1]
> ... 11 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29680) Create/Use Database with DatabaseName "default" fails

2022-10-18 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-29680:
-

 Summary: Create/Use Database with DatabaseName "default" fails
 Key: FLINK-29680
 URL: https://issues.apache.org/jira/browse/FLINK-29680
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.15.1
Reporter: Prabhu Joseph


Create/Use Database with DatabaseName "default" fails in both the Default and 
Hive dialects.

*Default Dialect:*
{code:java}
Flink SQL> create database default;
 #fails{code}
*Hive Dialect:*
{code:java}
Flink SQL> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = 
'/etc/hive/conf');
Flink SQL> use catalog myhive;
Flink SQL> show databases;
+---+
| database name |
+---+
|   default |
|   emr |
+---+
2 rows in set
Flink SQL> use emr;
   #works
Flink SQL> use default;
   #fails
{code}
 

*Exception:*
{code:java}
Caused by: org.apache.flink.sql.parser.impl.ParseException: Encountered 
"default" at line 1, column 17.
Was expecting one of:
 ...
 ...
 ...
 ...
 ...
 ...

at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:42459)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.jj_consume_token(FlinkSqlParserImpl.java:42270)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.IdentifierSegment(FlinkSqlParserImpl.java:26389)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.CompoundIdentifier(FlinkSqlParserImpl.java:26869)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreateDatabase(FlinkSqlParserImpl.java:4188)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreateExtended(FlinkSqlParserImpl.java:6583)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlCreate(FlinkSqlParserImpl.java:23158)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmt(FlinkSqlParserImpl.java:3507)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmtList(FlinkSqlParserImpl.java:2911)
 ~[?:?]
at 
org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.parseSqlStmtList(FlinkSqlParserImpl.java:287)
 ~[?:?]
at org.apache.calcite.sql.parser.SqlParser.parseStmtList(SqlParser.java:193) 
~[?:?]
at 
org.apache.flink.table.planner.parse.CalciteParser.parseSqlList(CalciteParser.java:77)
 ~[?:?]
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:101) 
~[?:?]
at 
org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$parseStatement$1(LocalExecutor.java:172)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
at 
org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
at 
org.apache.flink.table.client.gateway.local.LocalExecutor.parseStatement(LocalExecutor.java:172)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
... 11 more
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29432) Replace GenericUDFNvl with GenericUDFCoalesce

2022-09-27 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-29432:
--
Component/s: Connectors / Hive

> Replace GenericUDFNvl with GenericUDFCoalesce
> -
>
> Key: FLINK-29432
> URL: https://issues.apache.org/jira/browse/FLINK-29432
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Affects Versions: 1.15.2
>Reporter: Prabhu Joseph
>Priority: Major
>
> The Hive NVL() function has many issues, like 
> [HIVE-25193|https://issues.apache.org/jira/browse/HIVE-25193], and it was 
> retired in [HIVE-20961|https://issues.apache.org/jira/browse/HIVE-20961]. Our 
> internal Hive distribution has the fix for HIVE-20961. With this fix, the Flink 
> build fails as below because there is no longer a GenericUDFNvl in Hive. It 
> needs to be replaced with GenericUDFCoalesce.
> {code}
> [INFO] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserDefaultGraphWalker.java:
>  Recompile with -Xlint:unchecked for details.
> [INFO] -
> [ERROR] COMPILATION ERROR : 
> [INFO] -
> [ERROR] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[75,45]
>  cannot find symbol
>   symbol:   class GenericUDFNvl
>   location: package org.apache.hadoop.hive.ql.udf.generic
> [ERROR] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[1216,41]
>  cannot find symbol
>   symbol:   class GenericUDFNvl
>   location: class 
> org.apache.flink.table.planner.delegation.hive.HiveParserTypeCheckProcFactory.DefaultExprProcessor
> [ERROR] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserSemanticAnalyzer.java:[231,26]
>  constructor GlobalLimitCtx in class 
> org.apache.hadoop.hive.ql.parse.GlobalLimitCtx cannot be applied to given 
> types;
>   required: org.apache.hadoop.hive.conf.HiveConf
>   found: no arguments
>   reason: actual and formal argument lists differ in length
> [INFO] 3 errors
> {code}
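Since NVL(a, b) is semantically COALESCE(a, b), the replacement is mechanical. 
A sketch of the kind of change involved, assuming Hive's GenericUDFCoalesce 
(which accepts the same two-argument call):

{code:java}
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDFCoalesce;

public class NvlReplacementSketch {
    static GenericUDF nvlReplacement() {
        // Before (no longer compiles once HIVE-20961 removes the class):
        //   return new GenericUDFNvl();
        // After: COALESCE with two arguments has the same semantics as NVL.
        return new GenericUDFCoalesce();
    }
}
{code}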



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29432) Replace GenericUDFNvl with GenericUDFCoalesce

2022-09-27 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-29432:
-

 Summary: Replace GenericUDFNvl with GenericUDFCoalesce
 Key: FLINK-29432
 URL: https://issues.apache.org/jira/browse/FLINK-29432
 Project: Flink
  Issue Type: Improvement
Reporter: Prabhu Joseph


The Hive NVL() function has many issues, like 
[HIVE-25193|https://issues.apache.org/jira/browse/HIVE-25193], and it was retired 
in [HIVE-20961|https://issues.apache.org/jira/browse/HIVE-20961]. Our internal 
Hive distribution has the fix for HIVE-20961. With this fix, the Flink build 
fails as below because there is no longer a GenericUDFNvl in Hive. It needs to be 
replaced with GenericUDFCoalesce.

```
[INFO] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserDefaultGraphWalker.java:
 Recompile with -Xlint:unchecked for details.
[INFO] -
[ERROR] COMPILATION ERROR : 
[INFO] -

[ERROR] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[75,45]
 cannot find symbol
  symbol:   class GenericUDFNvl
  location: package org.apache.hadoop.hive.ql.udf.generic
[ERROR] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[1216,41]
 cannot find symbol
  symbol:   class GenericUDFNvl
  location: class 
org.apache.flink.table.planner.delegation.hive.HiveParserTypeCheckProcFactory.DefaultExprProcessor
[ERROR] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserSemanticAnalyzer.java:[231,26]
 constructor GlobalLimitCtx in class 
org.apache.hadoop.hive.ql.parse.GlobalLimitCtx cannot be applied to given types;
  required: org.apache.hadoop.hive.conf.HiveConf
  found: no arguments
  reason: actual and formal argument lists differ in length
[INFO] 3 errors
```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29432) Replace GenericUDFNvl with GenericUDFCoalesce

2022-09-27 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-29432:
--
Description: 
The Hive NVL() function has many issues, like 
[HIVE-25193|https://issues.apache.org/jira/browse/HIVE-25193], and it was retired 
in [HIVE-20961|https://issues.apache.org/jira/browse/HIVE-20961]. Our internal 
Hive distribution has the fix for HIVE-20961. With this fix, the Flink build 
fails as below because there is no longer a GenericUDFNvl in Hive. It needs to be 
replaced with GenericUDFCoalesce.

{code}
[INFO] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserDefaultGraphWalker.java:
 Recompile with -Xlint:unchecked for details.
[INFO] -
[ERROR] COMPILATION ERROR : 
[INFO] -

[ERROR] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[75,45]
 cannot find symbol
  symbol:   class GenericUDFNvl
  location: package org.apache.hadoop.hive.ql.udf.generic
[ERROR] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[1216,41]
 cannot find symbol
  symbol:   class GenericUDFNvl
  location: class 
org.apache.flink.table.planner.delegation.hive.HiveParserTypeCheckProcFactory.DefaultExprProcessor
[ERROR] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserSemanticAnalyzer.java:[231,26]
 constructor GlobalLimitCtx in class 
org.apache.hadoop.hive.ql.parse.GlobalLimitCtx cannot be applied to given types;
  required: org.apache.hadoop.hive.conf.HiveConf
  found: no arguments
  reason: actual and formal argument lists differ in length
[INFO] 3 errors
{code}

  was:
The Hive NVL() function has many issues, like 
[HIVE-25193|https://issues.apache.org/jira/browse/HIVE-25193], and it was retired 
in [HIVE-20961|https://issues.apache.org/jira/browse/HIVE-20961]. Our internal 
Hive distribution has the fix for HIVE-20961. With this fix, the Flink build 
fails as below because there is no longer a GenericUDFNvl in Hive. It needs to be 
replaced with GenericUDFCoalesce.

```
[INFO] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserDefaultGraphWalker.java:
 Recompile with -Xlint:unchecked for details.
[INFO] -
[ERROR] COMPILATION ERROR : 
[INFO] -

[ERROR] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[75,45]
 cannot find symbol
  symbol:   class GenericUDFNvl
  location: package org.apache.hadoop.hive.ql.udf.generic
[ERROR] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[1216,41]
 cannot find symbol
  symbol:   class GenericUDFNvl
  location: class 
org.apache.flink.table.planner.delegation.hive.HiveParserTypeCheckProcFactory.DefaultExprProcessor
[ERROR] 
/codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserSemanticAnalyzer.java:[231,26]
 constructor GlobalLimitCtx in class 
org.apache.hadoop.hive.ql.parse.GlobalLimitCtx cannot be applied to given types;
  required: org.apache.hadoop.hive.conf.HiveConf
  found: no arguments
  reason: actual and formal argument lists differ in length
[INFO] 3 errors
```


> Replace GenericUDFNvl with GenericUDFCoalesce
> -
>
> Key: FLINK-29432
> URL: https://issues.apache.org/jira/browse/FLINK-29432
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.14.4
>Reporter: Prabhu Joseph
>Priority: Major
>
> The Hive NVL() function has many issues, like 
> [HIVE-25193|https://issues.apache.org/jira/browse/HIVE-25193], and it was 
> retired in [HIVE-20961|https://issues.apache.org/jira/browse/HIVE-20961]. Our 
> internal Hive distribution has the fix for HIVE-20961. With this fix, the Flink 
> build is failing with 

[jira] [Updated] (FLINK-29432) Replace GenericUDFNvl with GenericUDFCoalesce

2022-09-27 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-29432:
--
Affects Version/s: 1.15.2
   (was: 1.14.4)

> Replace GenericUDFNvl with GenericUDFCoalesce
> -
>
> Key: FLINK-29432
> URL: https://issues.apache.org/jira/browse/FLINK-29432
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.15.2
>Reporter: Prabhu Joseph
>Priority: Major
>
> The Hive NVL() function has many issues, like 
> [HIVE-25193|https://issues.apache.org/jira/browse/HIVE-25193], and it was 
> retired in [HIVE-20961|https://issues.apache.org/jira/browse/HIVE-20961]. Our 
> internal Hive distribution has the fix for HIVE-20961. With this fix, the Flink 
> build fails as below because there is no longer a GenericUDFNvl in Hive. It 
> needs to be replaced with GenericUDFCoalesce.
> {code}
> [INFO] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserDefaultGraphWalker.java:
>  Recompile with -Xlint:unchecked for details.
> [INFO] -
> [ERROR] COMPILATION ERROR : 
> [INFO] -
> [ERROR] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[75,45]
>  cannot find symbol
>   symbol:   class GenericUDFNvl
>   location: package org.apache.hadoop.hive.ql.udf.generic
> [ERROR] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[1216,41]
>  cannot find symbol
>   symbol:   class GenericUDFNvl
>   location: class 
> org.apache.flink.table.planner.delegation.hive.HiveParserTypeCheckProcFactory.DefaultExprProcessor
> [ERROR] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserSemanticAnalyzer.java:[231,26]
>  constructor GlobalLimitCtx in class 
> org.apache.hadoop.hive.ql.parse.GlobalLimitCtx cannot be applied to given 
> types;
>   required: org.apache.hadoop.hive.conf.HiveConf
>   found: no arguments
>   reason: actual and formal argument lists differ in length
> [INFO] 3 errors
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29432) Replace GenericUDFNvl with GenericUDFCoalesce

2022-09-27 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-29432:
--
Affects Version/s: 1.14.4

> Replace GenericUDFNvl with GenericUDFCoalesce
> -
>
> Key: FLINK-29432
> URL: https://issues.apache.org/jira/browse/FLINK-29432
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.14.4
>Reporter: Prabhu Joseph
>Priority: Major
>
> The Hive NVL() function has many issues, like 
> [HIVE-25193|https://issues.apache.org/jira/browse/HIVE-25193], and it was 
> retired in [HIVE-20961|https://issues.apache.org/jira/browse/HIVE-20961]. Our 
> internal Hive distribution has the fix for HIVE-20961. With this fix, the Flink 
> build fails as below because there is no longer a GenericUDFNvl in Hive. It 
> needs to be replaced with GenericUDFCoalesce.
> ```
> [INFO] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserDefaultGraphWalker.java:
>  Recompile with -Xlint:unchecked for details.
> [INFO] -
> [ERROR] COMPILATION ERROR : 
> [INFO] -
> [ERROR] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[75,45]
>  cannot find symbol
>   symbol:   class GenericUDFNvl
>   location: package org.apache.hadoop.hive.ql.udf.generic
> [ERROR] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/HiveParserTypeCheckProcFactory.java:[1216,41]
>  cannot find symbol
>   symbol:   class GenericUDFNvl
>   location: class 
> org.apache.flink.table.planner.delegation.hive.HiveParserTypeCheckProcFactory.DefaultExprProcessor
> [ERROR] 
> /codebuild/output/src366217558/src/build/flink/rpm/BUILD/flink-1.15.2/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserSemanticAnalyzer.java:[231,26]
>  constructor GlobalLimitCtx in class 
> org.apache.hadoop.hive.ql.parse.GlobalLimitCtx cannot be applied to given 
> types;
>   required: org.apache.hadoop.hive.conf.HiveConf
>   found: no arguments
>   reason: actual and formal argument lists differ in length
> [INFO] 3 errors
> ```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-28406) Build fails on retrieving org.pentaho:pentaho-aggdesigner-algorithm

2022-07-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-28406:
--
Affects Version/s: 1.15.0
   (was: 1.15)

> Build fails on retrieving org.pentaho:pentaho-aggdesigner-algorithm
> ---
>
> Key: FLINK-28406
> URL: https://issues.apache.org/jira/browse/FLINK-28406
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> The build fails at flink-connector-hive_2.12 and flink-sql-client while 
> retrieving org.pentaho:pentaho-aggdesigner-algorithm from the hive-exec 
> dependency.
> Adding the same fix as 
> [FLINK-16432|https://github.com/apache/flink/pull/11316#issuecomment-600969333]
>  to both flink-connector-hive_2.12 and flink-sql-client fixes the issue.
> *Build Failure at flink-connector-hive_2.12*
> {code}
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  04:22 min (Wall Clock)
> [INFO] Finished at: 2022-07-05T12:26:17Z
> [INFO] 
> 
> [ERROR] Failed to execute goal on project flink-connector-hive_2.12: Could 
> not resolve dependencies for project 
> org.apache.flink:flink-connector-hive_2.12:jar:1.16-SNAPSHOT: Failed to 
> collect dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> 
> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read 
> artifact descriptor for 
> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer 
> artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to 
> maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for 
> repositories: [repository.jboss.org 
> (http://repository.jboss.org/nexus/content/groups/public/, default, 
> disabled), conjars (http://conjars.org/repo, default, releases+snapshots), 
> apache.snapshots (http://repository.apache.org/snapshots, default, 
> snapshots)] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> [ERROR] 
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :flink-connector-hive_2.12
> {code}
> *Build Failure at flink-sql-client*
> {code}
> [ERROR] Failed to execute goal on project flink-sql-client: Could not resolve 
> dependencies for project org.apache.flink:flink-sql-client:jar:1.16-SNAPSHOT: 
> Failed to collect dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> 
> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read 
> artifact descriptor for 
> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer 
> artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to 
> maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for 
> repositories: [repository.jboss.org 
> (http://repository.jboss.org/nexus/content/groups/public/, default, 
> disabled), conjars (http://conjars.org/repo, default, releases+snapshots), 
> apache.snapshots (http://repository.apache.org/snapshots, default, 
> snapshots)] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> [ERROR] 
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :flink-sql-client
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-28406) Build fails on retrieving org.pentaho:pentaho-aggdesigner-algorithm

2022-07-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-28406:
--
Affects Version/s: 1.15

> Build fails on retrieving org.pentaho:pentaho-aggdesigner-algorithm
> ---
>
> Key: FLINK-28406
> URL: https://issues.apache.org/jira/browse/FLINK-28406
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.15
>Reporter: Prabhu Joseph
>Priority: Major
>
> The build fails at flink-connector-hive_2.12 and flink-sql-client while 
> retrieving org.pentaho:pentaho-aggdesigner-algorithm from the hive-exec 
> dependency.
> Adding the same fix as 
> [FLINK-16432|https://github.com/apache/flink/pull/11316#issuecomment-600969333]
>  to both flink-connector-hive_2.12 and flink-sql-client fixes the issue.
> *Build Failure at flink-connector-hive_2.12*
> {code}
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  04:22 min (Wall Clock)
> [INFO] Finished at: 2022-07-05T12:26:17Z
> [INFO] 
> 
> [ERROR] Failed to execute goal on project flink-connector-hive_2.12: Could 
> not resolve dependencies for project 
> org.apache.flink:flink-connector-hive_2.12:jar:1.16-SNAPSHOT: Failed to 
> collect dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> 
> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read 
> artifact descriptor for 
> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer 
> artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to 
> maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for 
> repositories: [repository.jboss.org 
> (http://repository.jboss.org/nexus/content/groups/public/, default, 
> disabled), conjars (http://conjars.org/repo, default, releases+snapshots), 
> apache.snapshots (http://repository.apache.org/snapshots, default, 
> snapshots)] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> [ERROR] 
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :flink-connector-hive_2.12
> {code}
> *Build Failure at flink-sql-client*
> {code}
> [ERROR] Failed to execute goal on project flink-sql-client: Could not resolve 
> dependencies for project org.apache.flink:flink-sql-client:jar:1.16-SNAPSHOT: 
> Failed to collect dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> 
> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read 
> artifact descriptor for 
> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer 
> artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to 
> maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for 
> repositories: [repository.jboss.org 
> (http://repository.jboss.org/nexus/content/groups/public/, default, 
> disabled), conjars (http://conjars.org/repo, default, releases+snapshots), 
> apache.snapshots (http://repository.apache.org/snapshots, default, 
> snapshots)] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> [ERROR] 
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :flink-sql-client
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-28406) Build fails on retrieving org.pentaho:pentaho-aggdesigner-algorithm

2022-07-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-28406:
--
Description: 
Build fails at flink-connector-hive_2.12 and flink-sql-client on retrieving 
org.pentaho:pentaho-aggdesigner-algorithm from the dependency hive-exec.

Adding the same fix as 
[FLINK-16432|https://github.com/apache/flink/pull/11316#issuecomment-600969333] 
to both flink-connector-hive_2.12 and flink-sql-client fixes the issue.
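
One common form of that workaround is excluding the unresolvable artifact from 
the hive-exec dependency; a minimal sketch of such an exclusion (whether this 
matches the referenced FLINK-16432 change exactly, and the precise dependency 
declaration in each module's pom, are assumptions here):

{code:xml}
<!-- Sketch only: exclude the unresolvable Pentaho artifact that
     hive-exec pulls in transitively. The version shown matches the
     build error above. -->
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>2.3.9</version>
    <exclusions>
        <exclusion>
            <groupId>org.pentaho</groupId>
            <artifactId>pentaho-aggdesigner-algorithm</artifactId>
        </exclusion>
    </exclusions>
</dependency>
{code}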


*Build Failure at flink-connector-hive_2.12*

{code}

[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time:  04:22 min (Wall Clock)
[INFO] Finished at: 2022-07-05T12:26:17Z
[INFO] 
[ERROR] Failed to execute goal on project flink-connector-hive_2.12: Could not 
resolve dependencies for project 
org.apache.flink:flink-connector-hive_2.12:jar:1.16-SNAPSHOT: Failed to collect 
dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> 
org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read 
artifact descriptor for 
org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer 
artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to 
maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for repositories: 
[repository.jboss.org 
(http://repository.jboss.org/nexus/content/groups/public/, default, disabled), 
conjars (http://conjars.org/repo, default, releases+snapshots), 
apache.snapshots (http://repository.apache.org/snapshots, default, snapshots)] 
-> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :flink-connector-hive_2.12

{code}

*Build Failure at flink-sql-client*

{code}

[ERROR] Failed to execute goal on project flink-sql-client: Could not resolve 
dependencies for project org.apache.flink:flink-sql-client:jar:1.16-SNAPSHOT: 
Failed to collect dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> 
org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read 
artifact descriptor for 
org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer 
artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to 
maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for repositories: 
[repository.jboss.org 
(http://repository.jboss.org/nexus/content/groups/public/, default, disabled), 
conjars (http://conjars.org/repo, default, releases+snapshots), 
apache.snapshots (http://repository.apache.org/snapshots, default, snapshots)] 
-> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :flink-sql-client

{code}

  was:
Build fails at flink-connector-hive_2.12 and flink-sql-client on retrieving 
org.pentaho:pentaho-aggdesigner-algorithm from the dependency hive-exec.

Adding the same fix as 
[FLINK-16432|https://github.com/apache/flink/pull/11316#issuecomment-600969333] 
to both flink-connector-hive_2.12 and flink-sql-client fixes the issue.


*Build Failure:*

{code}

[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time:  04:22 min (Wall Clock)
[INFO] Finished at: 2022-07-05T12:26:17Z
[INFO] 
[ERROR] Failed to execute goal on project flink-connector-hive_2.12: Could not 
resolve dependencies for project 
org.apache.flink:flink-connector-hive_2.12:jar:1.16-SNAPSHOT: Failed to collect 
dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> 
org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read 
artifact descriptor for 
org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer 
artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to 
maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for repositories: 
[repository.jboss.org 
(http://repository.jboss.org/nexus/content/groups/public/, default, disabled), 
conjars (http://conjars.org/repo, default, releases+snapshots), 

[jira] [Created] (FLINK-28406) Build fails on retrieving org.pentaho:pentaho-aggdesigner-algorithm

2022-07-05 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-28406:
-

 Summary: Build fails on retrieving 
org.pentaho:pentaho-aggdesigner-algorithm
 Key: FLINK-28406
 URL: https://issues.apache.org/jira/browse/FLINK-28406
 Project: Flink
  Issue Type: Bug
Reporter: Prabhu Joseph


Build fails at flink-connector-hive_2.12 and flink-sql-client on retrieving 
org.pentaho:pentaho-aggdesigner-algorithm from the dependency hive-exec.

Adding the same fix as 
[FLINK-16432|https://github.com/apache/flink/pull/11316#issuecomment-600969333] 
to both flink-connector-hive_2.12 and flink-sql-client fixes the issue.


*Build Failure:*

{code}

[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time:  04:22 min (Wall Clock)
[INFO] Finished at: 2022-07-05T12:26:17Z
[INFO] 
[ERROR] Failed to execute goal on project flink-connector-hive_2.12: Could not 
resolve dependencies for project 
org.apache.flink:flink-connector-hive_2.12:jar:1.16-SNAPSHOT: Failed to collect 
dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> 
org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read 
artifact descriptor for 
org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer 
artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to 
maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for repositories: 
[repository.jboss.org 
(http://repository.jboss.org/nexus/content/groups/public/, default, disabled), 
conjars (http://conjars.org/repo, default, releases+snapshots), 
apache.snapshots (http://repository.apache.org/snapshots, default, snapshots)] 
-> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :flink-connector-hive_2.12



[ERROR] Failed to execute goal on project flink-sql-client: Could not resolve 
dependencies for project org.apache.flink:flink-sql-client:jar:1.16-SNAPSHOT: 
Failed to collect dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> 
org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read 
artifact descriptor for 
org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer 
artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to 
maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for repositories: 
[repository.jboss.org 
(http://repository.jboss.org/nexus/content/groups/public/, default, disabled), 
conjars (http://conjars.org/repo, default, releases+snapshots), 
apache.snapshots (http://repository.apache.org/snapshots, default, snapshots)] 
-> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :flink-sql-client

{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-28216) Hadoop S3FileSystemFactory does not honor fs.s3.impl

2022-06-29 Thread Prabhu Joseph (Jira)

Prabhu Joseph commented on FLINK-28216:
---

I think it is a good idea, as this will also provide isolation of the 
dependencies of EMRFS and the Hadoop S3 FileSystem. Will work on this and 
update you.
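
A sketch of the kind of isolated plugin layout this could enable; Flink loads 
each plugin under plugins/ in its own classloader, and the EMRFS plugin 
directory and jar name below are hypothetical:

{code}
# Sketch only: each plugin directory gets its own classloader, so EMRFS
# dependencies would not leak into the Hadoop S3 filesystem or the main
# classpath. The s3-fs-emrfs entry is hypothetical.
plugins/
  s3-fs-hadoop/
    flink-s3-fs-hadoop-1.15.0.jar
  s3-fs-emrfs/
    flink-s3-fs-emrfs-1.15.0.jar
{code}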
 

  
 
 
 
 

 
 
 

 
 
--
This message was sent by Atlassian Jira
(v8.20.10#820010-sha1:ace47f9)


[jira] [Comment Edited] (FLINK-28216) Hadoop S3FileSystemFactory does not honor fs.s3.impl

2022-06-28 Thread Prabhu Joseph (Jira)

Prabhu Joseph edited a comment on FLINK-28216:
---

[~martijnvisser] I was checking the fallback option, but it doesn't have 
features like Entropy Injection and RecoverableWriter, which FlinkS3FileSystem 
(the directly supported FileSystem on top of S3AFileSystem) provides. If 
FlinkS3FileSystem were built on top of the configured fs.s3.impl (EMRFS), we 
could get both the Flink and EMRFS provided features.
 

  
 
 
 
 

 
 
 

 
 
--
This message was sent by Atlassian Jira
(v8.20.10#820010-sha1:ace47f9)



[jira] [Commented] (FLINK-28216) Hadoop S3FileSystemFactory does not honor fs.s3.impl

2022-06-28 Thread Prabhu Joseph (Jira)

Prabhu Joseph commented on FLINK-28216:
---

Martijn Visser I was checking the fallback option, but it doesn't have 
features like Entropy Injection and RecoverableWriter, which FlinkS3FileSystem 
(the directly supported FileSystem on top of S3AFileSystem) provides. If 
FlinkS3FileSystem were built on top of the configured fs.s3.impl (EMRFS), we 
would get both the Flink and EMRFS provided features.
 

  
 
 
 
 

 
 
 

 
 
--
This message was sent by Atlassian Jira
(v8.20.10#820010-sha1:ace47f9)



[jira] [Commented] (FLINK-28231) Support Apache Ozone FileSystem

2022-06-24 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558406#comment-17558406
 ] 

Prabhu Joseph commented on FLINK-28231:
---

Thanks [~martijnvisser].

> Support Apache Ozone FileSystem
> ---
>
> Key: FLINK-28231
> URL: https://issues.apache.org/jira/browse/FLINK-28231
> Project: Flink
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 1.12.1
> Environment: * Flink1.12.1
>  * Hadoop3.2.2
>  * Ozone1.2.1
>Reporter: bianqi
>Priority: Major
> Attachments: abc.png, image-2022-06-24-10-05-42-193.png, 
> image-2022-06-24-10-08-26-010.png, mapreduce-webui.png, 
> 微信图片_20220623224142.png
>
>
> After the Ozone environment is configured, MapReduce tasks can be 
> submitted to read and write Ozone, but Flink tasks cannot be submitted.
> The error is as follows:
> {code:java}
> [root@jykj0 ozone-fs-hadoop]# flink run 
> /soft/flink13/examples/batch/WordCount.jar --input 
> ofs://jykj0.yarn.com/volume/bucket/warehouse/input --output 
> ofs://jykj0.yarn.com/volume/bucket/warehouse/output/
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/soft/flink-1.13.5/lib/log4j-slf4j-impl-2.16.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/soft/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> 2022-06-23 22:30:24,131 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli   
>              [] - Found Yarn properties file under 
> /tmp/flink-1.13.5/.yarn-properties-root.
> 2022-06-23 22:30:24,131 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli   
>              [] - Found Yarn properties file under 
> /tmp/flink-1.13.5/.yarn-properties-root.
> 2022-06-23 22:30:27,151 INFO  
> org.apache.hadoop.io.retry.RetryInvocationHandler            [] - 
> java.lang.IllegalStateException, while invoking $Proxy27.submitRequest over 
> nodeId=null,nodeAddress=jykj0.yarn.com:9862 after 1 failover attempts. Trying 
> to failover after sleeping for 4000ms.
> 2022-06-23 22:30:31,152 INFO  
> org.apache.hadoop.io.retry.RetryInvocationHandler            [] - 
> java.lang.IllegalStateException, while invoking $Proxy27.submitRequest over 
> nodeId=null,nodeAddress=jykj0.yarn.com:9862 after 2 failover attempts. Trying 
> to failover after sleeping for 6000ms. 
> {code}
> Maybe there is something wrong with my configuration; hope the 
> documentation can be updated:
> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/filesystems/overview/



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-28231) Support Apache Ozone FileSystem

2022-06-24 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558383#comment-17558383
 ] 

Prabhu Joseph commented on FLINK-28231:
---

[~martijnvisser] Could you please make me a contributor? 

[~bianqi] Do you want to work on this? If not, I would like to take this up.

> Support Apache Ozone FileSystem
> ---
>
> Key: FLINK-28231
> URL: https://issues.apache.org/jira/browse/FLINK-28231
> Project: Flink
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 1.12.1
> Environment: * Flink1.12.1
>  * Hadoop3.2.2
>  * Ozone1.2.1
>Reporter: bianqi
>Priority: Major
> Attachments: abc.png, image-2022-06-24-10-05-42-193.png, 
> image-2022-06-24-10-08-26-010.png, mapreduce-webui.png, 
> 微信图片_20220623224142.png
>
>
> After the Ozone environment is configured, MapReduce tasks can be 
> submitted to read and write Ozone, but Flink tasks cannot be submitted.
> The error is as follows:
> {code:java}
> [root@jykj0 ozone-fs-hadoop]# flink run 
> /soft/flink13/examples/batch/WordCount.jar --input 
> ofs://jykj0.yarn.com/volume/bucket/warehouse/input --output 
> ofs://jykj0.yarn.com/volume/bucket/warehouse/output/
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/soft/flink-1.13.5/lib/log4j-slf4j-impl-2.16.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/soft/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> 2022-06-23 22:30:24,131 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli   
>              [] - Found Yarn properties file under 
> /tmp/flink-1.13.5/.yarn-properties-root.
> 2022-06-23 22:30:24,131 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli   
>              [] - Found Yarn properties file under 
> /tmp/flink-1.13.5/.yarn-properties-root.
> 2022-06-23 22:30:27,151 INFO  
> org.apache.hadoop.io.retry.RetryInvocationHandler            [] - 
> java.lang.IllegalStateException, while invoking $Proxy27.submitRequest over 
> nodeId=null,nodeAddress=jykj0.yarn.com:9862 after 1 failover attempts. Trying 
> to failover after sleeping for 4000ms.
> 2022-06-23 22:30:31,152 INFO  
> org.apache.hadoop.io.retry.RetryInvocationHandler            [] - 
> java.lang.IllegalStateException, while invoking $Proxy27.submitRequest over 
> nodeId=null,nodeAddress=jykj0.yarn.com:9862 after 2 failover attempts. Trying 
> to failover after sleeping for 6000ms. 
> {code}
> Maybe there is something wrong with my configuration; hope the 
> documentation can be updated:
> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/filesystems/overview/



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-28231) Support Apache Ozone FileSystem

2022-06-23 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558153#comment-17558153
 ] 

Prabhu Joseph commented on FLINK-28231:
---

[~bianqi] The client fails to invoke the remote server (Ozone Manager) method. 
Could you check whether there are any errors in the Ozone Manager, which runs 
at jykj0.yarn.com:9862? 

> Support Apache Ozone FileSystem
> ---
>
> Key: FLINK-28231
> URL: https://issues.apache.org/jira/browse/FLINK-28231
> Project: Flink
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 1.12.1
> Environment: * Flink1.12.1
>  * Hadoop3.2.2
>  * Ozone1.2.1
>Reporter: bianqi
>Priority: Major
> Attachments: 微信图片_20220623224142.png
>
>
> After the Ozone environment is configured, MapReduce tasks can be 
> submitted to read and write Ozone, but Flink tasks cannot be submitted.
> The error is as follows:
> {code:java}
> [root@jykj0 ozone-fs-hadoop]# flink run 
> /soft/flink13/examples/batch/WordCount.jar --input 
> ofs://jykj0.yarn.com/volume/bucket/warehouse/input --output 
> ofs://jykj0.yarn.com/volume/bucket/warehouse/output/
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/soft/flink-1.13.5/lib/log4j-slf4j-impl-2.16.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/soft/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> 2022-06-23 22:30:24,131 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli   
>              [] - Found Yarn properties file under 
> /tmp/flink-1.13.5/.yarn-properties-root.
> 2022-06-23 22:30:24,131 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli   
>              [] - Found Yarn properties file under 
> /tmp/flink-1.13.5/.yarn-properties-root.
> 2022-06-23 22:30:27,151 INFO  
> org.apache.hadoop.io.retry.RetryInvocationHandler            [] - 
> java.lang.IllegalStateException, while invoking $Proxy27.submitRequest over 
> nodeId=null,nodeAddress=jykj0.yarn.com:9862 after 1 failover attempts. Trying 
> to failover after sleeping for 4000ms.
> 2022-06-23 22:30:31,152 INFO  
> org.apache.hadoop.io.retry.RetryInvocationHandler            [] - 
> java.lang.IllegalStateException, while invoking $Proxy27.submitRequest over 
> nodeId=null,nodeAddress=jykj0.yarn.com:9862 after 2 failover attempts. Trying 
> to failover after sleeping for 6000ms. 
> {code}
> Maybe there is something wrong with my configuration; hope the 
> documentation can be updated:
> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/filesystems/overview/



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-28216) Hadoop S3FileSystemFactory does not honor fs.s3.impl

2022-06-23 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557856#comment-17557856
 ] 

Prabhu Joseph commented on FLINK-28216:
---

[~martijnvisser] EMR has its own 
[EMRFS|https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-fs.html] to 
access Amazon S3. Currently, EMR Hive, Spark and Presto use EMRFS. But Flink 
is hardcoded to the Hadoop S3AFileSystem and does not check the fs.s3.impl 
config from core-site.xml:

{code}
<property>
  <name>fs.s3.impl</name>
  <value>com.amazon.ws.emr.hadoop.fs.EmrFileSystem</value>
</property>
{code}

bq. Is it not already possible, as documented on 
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/s3/
 ?

Yes, it is possible to access S3 from Flink using S3AFileSystem as per the 
above document, but not using EMRFS or any other implementation.


> Hadoop S3FileSystemFactory does not honor fs.s3.impl
> 
>
> Key: FLINK-28216
> URL: https://issues.apache.org/jira/browse/FLINK-28216
> Project: Flink
>  Issue Type: Improvement
>  Components: FileSystems
>Affects Versions: 1.15.0
>Reporter: Prabhu Joseph
>Priority: Minor
>
> Currently, Hadoop S3FileSystemFactory hardcodes the S3 FileSystem 
> implementation to S3AFileSystem. It does not allow configuring any other 
> implementation via fs.s3.impl. Suggest reading fs.s3.impl from the loaded 
> Hadoop Config and using that implementation.
>  
> {code:java}
> @Override
> protected org.apache.hadoop.fs.FileSystem createHadoopFileSystem() {
>     return new S3AFileSystem();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-28216) Hadoop S3FileSystemFactory does not honor fs.s3.impl

2022-06-23 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17557834#comment-17557834
 ] 

Prabhu Joseph commented on FLINK-28216:
---

[~martijnvisser] Could you please make me a contributor? I would like to 
provide a patch for this Jira.

> Hadoop S3FileSystemFactory does not honor fs.s3.impl
> 
>
> Key: FLINK-28216
> URL: https://issues.apache.org/jira/browse/FLINK-28216
> Project: Flink
>  Issue Type: Improvement
>  Components: FileSystems
>Affects Versions: 1.15.0
>Reporter: Prabhu Joseph
>Priority: Minor
>
> Currently, Hadoop S3FileSystemFactory hardcodes the S3 FileSystem 
> implementation to S3AFileSystem. It does not allow configuring any other 
> implementation via fs.s3.impl. Suggest reading fs.s3.impl from the loaded 
> Hadoop Config and using that implementation.
>  
> {code:java}
> @Override
> protected org.apache.hadoop.fs.FileSystem createHadoopFileSystem() {
>     return new S3AFileSystem();
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28216) Hadoop S3FileSystemFactory does not honor fs.s3.impl

2022-06-23 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-28216:
-

 Summary: Hadoop S3FileSystemFactory does not honor fs.s3.impl
 Key: FLINK-28216
 URL: https://issues.apache.org/jira/browse/FLINK-28216
 Project: Flink
  Issue Type: Improvement
  Components: FileSystems
Affects Versions: 1.15.0
Reporter: Prabhu Joseph


Currently, Hadoop S3FileSystemFactory hardcodes the S3 FileSystem 
implementation to S3AFileSystem. It does not allow configuring any other 
implementation via fs.s3.impl. Suggest reading fs.s3.impl from the loaded 
Hadoop Config and using that implementation.

 
{code:java}
@Override
protected org.apache.hadoop.fs.FileSystem createHadoopFileSystem() {
    return new S3AFileSystem();
}
{code}
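
A minimal sketch of the suggested change, assuming the factory has access to 
the loaded Hadoop Configuration (the hadoopConfig field below is hypothetical):

{code:java}
// Sketch only: honor fs.s3.impl from the loaded Hadoop Configuration
// instead of hardcoding S3AFileSystem; S3AFileSystem remains the default
// when the key is unset. "hadoopConfig" is a hypothetical field here.
@Override
protected org.apache.hadoop.fs.FileSystem createHadoopFileSystem() {
    Class<? extends org.apache.hadoop.fs.FileSystem> implClass =
            hadoopConfig.getClass(
                    "fs.s3.impl",
                    org.apache.hadoop.fs.s3a.S3AFileSystem.class,
                    org.apache.hadoop.fs.FileSystem.class);
    return org.apache.hadoop.util.ReflectionUtils.newInstance(implClass, hadoopConfig);
}
{code}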



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-28112) Misleading error message when Hadoop S3FileSystem is used and not at classpath

2022-06-17 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1790#comment-1790
 ] 

Prabhu Joseph commented on FLINK-28112:
---

Yes, you are right. But the *S3 scheme is not directly supported by Flink* log 
message is wrong information for the user; Flink does directly support the S3 
scheme. 

Suggest checking whether the scheme is one of the directly supported 
filesystems, and logging the message in the final else accordingly:

{code}
} else {
    try {
        fs = FALLBACK_FACTORY.create(uri);
    } catch (UnsupportedFileSystemSchemeException e) {
        throw new UnsupportedFileSystemSchemeException(
                "Could not find a file system implementation for scheme '"
                        + uri.getScheme()
                        + "'. The scheme is not directly supported by Flink and no Hadoop file system to "
                        + "support this scheme could be loaded. For a full list of supported file systems, "
                        + "please see https://nightlies.apache.org/flink/flink-docs-stable/ops/filesystems/.",
                e);
    }
}
{code}
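
A sketch of that suggestion; the set of directly supported schemes and the 
exact wording below are illustrative, not the actual Flink change:

{code:java}
} catch (UnsupportedFileSystemSchemeException e) {
    // Sketch only: DIRECTLY_SUPPORTED_SCHEMES is a hypothetical set of
    // schemes that Flink ships optional filesystem plugins for.
    String scheme = uri.getScheme();
    String hint =
            DIRECTLY_SUPPORTED_SCHEMES.contains(scheme)
                    ? "The scheme is supported by Flink through an optional "
                            + "filesystem plugin that is not on the classpath."
                    : "The scheme is not directly supported by Flink and no Hadoop "
                            + "file system to support this scheme could be loaded.";
    throw new UnsupportedFileSystemSchemeException(
            "Could not find a file system implementation for scheme '"
                    + scheme
                    + "'. "
                    + hint
                    + " For a full list of supported file systems, please see "
                    + "https://nightlies.apache.org/flink/flink-docs-stable/ops/filesystems/.",
            e);
}
{code}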

> Misleading error message when Hadoop S3FileSystem is used and not at classpath
> --
>
> Key: FLINK-28112
> URL: https://issues.apache.org/jira/browse/FLINK-28112
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Prabhu Joseph
>Priority: Minor
>
> When using the Hadoop S3FileSystem while it is not on the classpath, the 
> below error message is thrown, stating *S3 scheme is not directly supported 
> by Flink*, which is misleading. Actually, it is one of the directly 
> supported filesystems.
> {code}
> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: 
> Could not find a file system implementation for scheme 's3'. The scheme is 
> not directly supported by Flink and no Hadoop file system to support this 
> scheme could be loaded. For a full list of supported file systems, please see 
> https://nightlies.apache.org/flink/flink-docs-stable/ops/filesystems/.
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28112) Misleading error message when Hadoop S3FileSystem is used and not at classpath

2022-06-17 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-28112:
-

 Summary: Misleading error message when Hadoop S3FileSystem is used 
and not at classpath
 Key: FLINK-28112
 URL: https://issues.apache.org/jira/browse/FLINK-28112
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.15.0
Reporter: Prabhu Joseph


When using the Hadoop S3FileSystem while it is not on the classpath, the below 
error message is thrown, stating *S3 scheme is not directly supported by 
Flink*, which is misleading. Actually, it is one of the directly supported 
filesystems.

{code}
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could 
not find a file system implementation for scheme 's3'. The scheme is not 
directly supported by Flink and no Hadoop file system to support this scheme 
could be loaded. For a full list of supported file systems, please see 
https://nightlies.apache.org/flink/flink-docs-stable/ops/filesystems/.
{code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (FLINK-27967) Unable to inject entropy on s3 prefix when doing a save point in Flink

2022-06-17 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1735#comment-1735
 ] 

Prabhu Joseph edited comment on FLINK-27967 at 6/17/22 10:46 AM:
-

We identified this as happening due to a configuration issue. Flink supports 
entropy injection only when FlinkS3FileSystem is used, not when the Hadoop 
S3FileSystem is used. Adding the below to flink-conf.yaml fixed the issue:

{code}
yarn.ship-files: /usr/lib/flink/opt/flink-s3-fs-hadoop-1.14.0.jar
#fs.allowed-fallback-filesystems: s3
{code}


was (Author: prabhu joseph):
We identified this to be a configuration issue. Flink supports entropy 
injection only when FlinkS3FileSystem is used, not when the Hadoop 
S3FileSystem is used. Adding the below to flink-conf.yaml fixed the issue:

{code}
yarn.ship-files: /usr/lib/flink/opt/flink-s3-fs-hadoop-1.14.0.jar
#fs.allowed-fallback-filesystems: s3
{code}

> Unable to inject entropy on s3 prefix when doing a save point in Flink
> --
>
> Key: FLINK-27967
> URL: https://issues.apache.org/jira/browse/FLINK-27967
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.14.0
>Reporter: Vinay Devadiga
>Priority: Major
> Attachments: Screenshot 2022-06-09 at 12.10.51 PM.png
>
>
> Hi, while using Flink 1.14.0, I ran the example job 
> /examples/streaming/StateMachineExample.jar and the job submitted 
> successfully. Then I tried to savepoint it to S3 with entropy enabled, but 
> the entropy key was not respected. Here are my configurations. Can anyone 
> please guide me through this issue? I am trying to go through the code but 
> could not find anything substantial.
> |flink-conf|fs.allowed-fallback-filesystems|s3|Cluster configuration| |
> |flink-conf|execution.checkpointing.unaligned|true|Cluster configuration| |
> |flink-conf|state.backend.incremental|true|Cluster configuration| |
> |flink-conf|execution.checkpointing.timeout|600min|Cluster configuration| |
> |flink-conf|execution.checkpointing.externalized-checkpoint-retention|RETAIN_ON_CANCELLATION|Cluster
>  configuration| |
> |flink-conf|state.backend|rocksdb|Cluster configuration| |
> |flink-conf|s3.entropy.key|_entropy_|Cluster configuration| |
> |flink-conf|state.checkpoints.dir|s3://vinaydevuswest2/_entropy_/flink/checkpoint-data/|Cluster
>  configuration| |
> |flink-conf|execution.checkpointing.max-concurrent-checkpoints|1|Cluster 
> configuration| |
> |flink-conf|execution.checkpointing.min-pause|5000|Cluster configuration| |
> |flink-conf|execution.checkpointing.checkpoints-after-tasks-finish.enabled|true|Cluster
>  configuration| |
> |flink-conf|state.savepoints.dir|s3://vinaydevuswest2/_entropy_/flink/savepoint-data/|Cluster
>  configuration| |
> |flink-conf|state.storage.fs.memory-threshold|0|Cluster configuration| |
> |flink-conf|s3.entropy.length|4|Cluster configuration| |
> |flink-conf|execution.checkpointing.tolerable-failed-checkpoints|30|Cluster 
> configuration| |
> |flink-conf|execution.checkpointing.mode|EXACTLY_ONCE|Cluster configuration|



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27967) Unable to inject entropy on s3 prefix when doing a save point in Flink

2022-06-17 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1735#comment-1735
 ] 

Prabhu Joseph commented on FLINK-27967:
---

We identified this to be a configuration issue. Flink supports entropy 
injection only when FlinkS3FileSystem is used, not when the Hadoop 
S3FileSystem is used. Adding the below to flink-conf.yaml fixed the issue:

{code}
yarn.ship-files: /usr/lib/flink/opt/flink-s3-fs-hadoop-1.14.0.jar
#fs.allowed-fallback-filesystems: s3
{code}
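
For context, entropy injection rewrites the configured marker in checkpoint 
and savepoint paths with random characters; a sketch of the relevant 
flink-conf.yaml keys, with illustrative values:

{code}
# Illustrative sketch: FlinkS3FileSystem replaces the _entropy_ marker
# with s3.entropy.length random characters, e.g.
#   s3://bucket/_entropy_/checkpoints  ->  s3://bucket/x7Kq/checkpoints
s3.entropy.key: _entropy_
s3.entropy.length: 4
state.checkpoints.dir: s3://bucket/_entropy_/checkpoints
{code}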

> Unable to inject entropy on s3 prefix when doing a save point in Flink
> --
>
> Key: FLINK-27967
> URL: https://issues.apache.org/jira/browse/FLINK-27967
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.14.0
>Reporter: Vinay Devadiga
>Priority: Major
> Attachments: Screenshot 2022-06-09 at 12.10.51 PM.png
>
>
> Hi, while using Flink 1.14.0, I ran the example job 
> /examples/streaming/StateMachineExample.jar and the job submitted 
> successfully. Then I tried to savepoint it to S3 with entropy enabled, but 
> the entropy key was not respected. Here are my configurations. Can anyone 
> please guide me through this issue? I am trying to go through the code but 
> could not find anything substantial.
> |flink-conf|fs.allowed-fallback-filesystems|s3|Cluster configuration| |
> |flink-conf|execution.checkpointing.unaligned|true|Cluster configuration| |
> |flink-conf|state.backend.incremental|true|Cluster configuration| |
> |flink-conf|execution.checkpointing.timeout|600min|Cluster configuration| |
> |flink-conf|execution.checkpointing.externalized-checkpoint-retention|RETAIN_ON_CANCELLATION|Cluster
>  configuration| |
> |flink-conf|state.backend|rocksdb|Cluster configuration| |
> |flink-conf|s3.entropy.key|_entropy_|Cluster configuration| |
> |flink-conf|state.checkpoints.dir|s3://vinaydevuswest2/_entropy_/flink/checkpoint-data/|Cluster
>  configuration| |
> |flink-conf|execution.checkpointing.max-concurrent-checkpoints|1|Cluster 
> configuration| |
> |flink-conf|execution.checkpointing.min-pause|5000|Cluster configuration| |
> |flink-conf|execution.checkpointing.checkpoints-after-tasks-finish.enabled|true|Cluster
>  configuration| |
> |flink-conf|state.savepoints.dir|s3://vinaydevuswest2/_entropy_/flink/savepoint-data/|Cluster
>  configuration| |
> |flink-conf|state.storage.fs.memory-threshold|0|Cluster configuration| |
> |flink-conf|s3.entropy.length|4|Cluster configuration| |
> |flink-conf|execution.checkpointing.tolerable-failed-checkpoints|30|Cluster 
> configuration| |
> |flink-conf|execution.checkpointing.mode|EXACTLY_ONCE|Cluster configuration|



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-28102) Flink AkkaRpcSystemLoader fails when temporary directory is a symlink

2022-06-17 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17555455#comment-17555455
 ] 

Prabhu Joseph commented on FLINK-28102:
---

Yes, setting io.tmp.dirs to the actual directory pointed to by the symlink 
worked. Shall we improve the logic to handle the case where the symlink 
points to an actual directory as well?
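
A minimal sketch of such a tolerant check (not the actual Flink patch), 
assuming the loader only needs the directory to exist:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TempDirSketch {

    // Sketch only: Files.createDirectories fails with
    // FileAlreadyExistsException when the path is a symlink, even one
    // that resolves to a directory. Checking with Files.isDirectory
    // first (which follows symlinks by default) tolerates that case.
    static Path ensureDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            Files.createDirectories(dir);
        }
        return dir;
    }

    public static void main(String[] args) throws IOException {
        ensureDirectory(Paths.get(System.getProperty("java.io.tmpdir")));
    }
}
{code}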

> Flink AkkaRpcSystemLoader fails when temporary directory is a symlink
> -
>
> Key: FLINK-28102
> URL: https://issues.apache.org/jira/browse/FLINK-28102
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / RPC
>Affects Versions: 1.15.0
>Reporter: Prabhu Joseph
>Priority: Minor
>
> Flink AkkaRpcSystemLoader fails when temporary directory is a symlink
> *Error Message:*
> {code}
> Caused by: java.nio.file.FileAlreadyExistsException: /tmp
> at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:88) 
> ~[?:1.8.0_332]
> at 
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) 
> ~[?:1.8.0_332]
> at 
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) 
> ~[?:1.8.0_332]
> at 
> sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
>  ~[?:1.8.0_332]
> at java.nio.file.Files.createDirectory(Files.java:674) ~[?:1.8.0_332]
> at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781) 
> ~[?:1.8.0_332]
> at java.nio.file.Files.createDirectories(Files.java:727) 
> ~[?:1.8.0_332]
> at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcSystemLoader.loadRpcSystem(AkkaRpcSystemLoader.java:58)
>  ~[flink-dist-1.15.0.jar:1.15.0]
> at org.apache.flink.runtime.rpc.RpcSystem.load(RpcSystem.java:101) 
> ~[flink-dist-1.15.0.jar:1.15.0]
> at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManagerRunnerServices(TaskManagerRunner.java:186)
>  ~[flink-dist-1.15.0.jar:1.15.0]
> at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.start(TaskManagerRunner.java:288)
>  ~[flink-dist-1.15.0.jar:1.15.0]
> at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:481)
>  ~[flink-dist-1.15.0.jar:1.15.0]
> {code}
> *Repro:*
> {code}
> 1. /tmp is a symlink pointing to the actual directory /mnt/tmp
> [root@prabhuHost log]# ls -lrt /tmp
> lrwxrwxrwx 1 root root 8 Jun 15 07:51 /tmp -> /mnt/tmp
> 2. Start Cluster
> ./bin/start-cluster.sh
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-28102) Flink AkkaRpcSystemLoader fails when temporary directory is a symlink

2022-06-16 Thread Prabhu Joseph (Jira)
Prabhu Joseph created FLINK-28102:
-

 Summary: Flink AkkaRpcSystemLoader fails when temporary directory 
is a symlink
 Key: FLINK-28102
 URL: https://issues.apache.org/jira/browse/FLINK-28102
 Project: Flink
  Issue Type: Bug
  Components: Runtime / RPC
Affects Versions: 1.15.0
Reporter: Prabhu Joseph


Flink AkkaRpcSystemLoader fails when temporary directory is a symlink

*Error Message:*
{code}
Caused by: java.nio.file.FileAlreadyExistsException: /tmp
at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:88) 
~[?:1.8.0_332]
at 
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) 
~[?:1.8.0_332]
at 
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) 
~[?:1.8.0_332]
at 
sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
 ~[?:1.8.0_332]
at java.nio.file.Files.createDirectory(Files.java:674) ~[?:1.8.0_332]
at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781) 
~[?:1.8.0_332]
at java.nio.file.Files.createDirectories(Files.java:727) ~[?:1.8.0_332]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcSystemLoader.loadRpcSystem(AkkaRpcSystemLoader.java:58)
 ~[flink-dist-1.15.0.jar:1.15.0]
at org.apache.flink.runtime.rpc.RpcSystem.load(RpcSystem.java:101) 
~[flink-dist-1.15.0.jar:1.15.0]
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManagerRunnerServices(TaskManagerRunner.java:186)
 ~[flink-dist-1.15.0.jar:1.15.0]
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.start(TaskManagerRunner.java:288)
 ~[flink-dist-1.15.0.jar:1.15.0]
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:481)
 ~[flink-dist-1.15.0.jar:1.15.0]
{code}


*Repro:*

1. /tmp is a symlink pointing to the actual directory /mnt/tmp

{code}
[root@prabhuHost log]# ls -lrt /tmp
lrwxrwxrwx 1 root root 8 Jun 15 07:51 /tmp -> /mnt/tmp
{code}

2. Start Cluster

./bin/start-cluster.sh






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-28102) Flink AkkaRpcSystemLoader fails when temporary directory is a symlink

2022-06-16 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated FLINK-28102:
--
Description: 
Flink AkkaRpcSystemLoader fails when temporary directory is a symlink

*Error Message:*
{code}
Caused by: java.nio.file.FileAlreadyExistsException: /tmp
at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:88) 
~[?:1.8.0_332]
at 
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) 
~[?:1.8.0_332]
at 
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) 
~[?:1.8.0_332]
at 
sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
 ~[?:1.8.0_332]
at java.nio.file.Files.createDirectory(Files.java:674) ~[?:1.8.0_332]
at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781) 
~[?:1.8.0_332]
at java.nio.file.Files.createDirectories(Files.java:727) ~[?:1.8.0_332]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcSystemLoader.loadRpcSystem(AkkaRpcSystemLoader.java:58)
 ~[flink-dist-1.15.0.jar:1.15.0]
at org.apache.flink.runtime.rpc.RpcSystem.load(RpcSystem.java:101) 
~[flink-dist-1.15.0.jar:1.15.0]
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManagerRunnerServices(TaskManagerRunner.java:186)
 ~[flink-dist-1.15.0.jar:1.15.0]
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.start(TaskManagerRunner.java:288)
 ~[flink-dist-1.15.0.jar:1.15.0]
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:481)
 ~[flink-dist-1.15.0.jar:1.15.0]
{code}


*Repro:*

{code}
1. /tmp is a symlink pointing to the actual directory /mnt/tmp

[root@prabhuHost log]# ls -lrt /tmp
lrwxrwxrwx 1 root root 8 Jun 15 07:51 /tmp -> /mnt/tmp

2. Start Cluster
./bin/start-cluster.sh

{code}


  was:
Flink AkkaRpcSystemLoader fails when temporary directory is a symlink

*Error Message:*
{code}
Caused by: java.nio.file.FileAlreadyExistsException: /tmp
at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:88) 
~[?:1.8.0_332]
at 
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) 
~[?:1.8.0_332]
at 
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) 
~[?:1.8.0_332]
at 
sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
 ~[?:1.8.0_332]
at java.nio.file.Files.createDirectory(Files.java:674) ~[?:1.8.0_332]
at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781) 
~[?:1.8.0_332]
at java.nio.file.Files.createDirectories(Files.java:727) ~[?:1.8.0_332]
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcSystemLoader.loadRpcSystem(AkkaRpcSystemLoader.java:58)
 ~[flink-dist-1.15.0.jar:1.15.0]
at org.apache.flink.runtime.rpc.RpcSystem.load(RpcSystem.java:101) 
~[flink-dist-1.15.0.jar:1.15.0]
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManagerRunnerServices(TaskManagerRunner.java:186)
 ~[flink-dist-1.15.0.jar:1.15.0]
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.start(TaskManagerRunner.java:288)
 ~[flink-dist-1.15.0.jar:1.15.0]
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:481)
 ~[flink-dist-1.15.0.jar:1.15.0]
{code}


*Repro:*

1. /tmp is a symlink pointing to the actual directory /mnt/tmp

{code}
[root@prabhuHost log]# ls -lrt /tmp
lrwxrwxrwx 1 root root 8 Jun 15 07:51 /tmp -> /mnt/tmp
{code}

2. Start Cluster

./bin/start-cluster.sh





> Flink AkkaRpcSystemLoader fails when temporary directory is a symlink
> -
>
> Key: FLINK-28102
> URL: https://issues.apache.org/jira/browse/FLINK-28102
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / RPC
>Affects Versions: 1.15.0
>Reporter: Prabhu Joseph
>Priority: Minor
>
> Flink AkkaRpcSystemLoader fails when temporary directory is a symlink
> *Error Message:*
> {code}
> Caused by: java.nio.file.FileAlreadyExistsException: /tmp
> at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:88) 
> ~[?:1.8.0_332]
> at 
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) 
> ~[?:1.8.0_332]
> at 
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) 
> ~[?:1.8.0_332]
> at 
> sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
>  ~[?:1.8.0_332]
> at java.nio.file.Files.createDirectory(Files.java:674) ~[?:1.8.0_332]
> at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781) 
> ~[?:1.8.0_332]
> at java.nio.file.Files.createDirectories(Files.java:727) 
> ~[?:1.8.0_332]
> at 
> 
