[jira] [Reopened] (FLINK-16789) Retrieve JMXRMI information via REST API or WebUI

2020-08-27 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong reopened FLINK-16789:
---

> Retrieve JMXRMI information via REST API or WebUI
> -
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
>
> Currently there is no easy way to assign a JMX RMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=<port>
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to a fixed value is not usually a viable
> solution when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option;
> however, there's no easy way to retrieve such a port assignment. We propose to
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information
> via either the REST API or the WebUI.
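For illustration, a minimal self-contained sketch of the proposed mechanism using only standard JDK APIs. The probe-then-bind pattern, the localhost-only service URL, and the class/variable names are illustrative assumptions; only the use of JMXConnectorServerFactory with an OS-assigned port comes from the description above.

{code:java}
import java.lang.management.ManagementFactory;
import java.net.ServerSocket;
import java.rmi.registry.LocateRegistry;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

public class JmxServerSketch {
    public static void main(String[] args) throws Exception {
        // Ask the OS for a free port up front (small race window; a real
        // implementation would retry on bind collisions).
        int port;
        try (ServerSocket probe = new ServerSocket(0)) {
            port = probe.getLocalPort();
        }

        // The connector's JNDI URL needs an RMI registry to register into.
        LocateRegistry.createRegistry(port);

        MBeanServer mBeanServer = ManagementFactory.getPlatformMBeanServer();
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi://localhost:" + port
                        + "/jndi/rmi://localhost:" + port + "/jmxrmi");
        JMXConnectorServer server =
                JMXConnectorServerFactory.newJMXConnectorServer(url, null, mBeanServer);
        server.start();

        // This is the JMXRMI information the issue proposes to expose.
        System.out.println("JMX connector listening on port " + port);

        server.stop();
    }
}
{code}

In the proposal, the assigned port would be recorded by ClusterEntrypoint / TaskManagerRunner and surfaced through the REST API or WebUI rather than printed.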





[jira] [Updated] (FLINK-16789) Retrieve JMXRMI information via REST API or WebUI

2020-08-27 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-16789:
--
Summary: Retrieve JMXRMI information via REST API or WebUI  (was: Support 
JMX RMI random port assign & retrieval via JMXConnectorServer)

> Retrieve JMXRMI information via REST API or WebUI
> -
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
>
> Currently there is no easy way to assign a JMX RMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=<port>
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to a fixed value is not usually a viable
> solution when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option;
> however, there's no easy way to retrieve such a port assignment. We propose to
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information
> via either the REST API or the WebUI.





[jira] [Commented] (FLINK-16789) Support JMX RMI random port assign & retrieval via JMXConnectorServer

2020-08-27 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186045#comment-17186045
 ] 

Rong Rong commented on FLINK-16789:
---

Oops, yes, you are right [~chesnay]: the closed PR doesn't contain any WebUI /
REST API changes. I will prepare another PR exposing a REST API for JMX
URL/port retrieval.

(FYI: our current approach is to dig directly into the container startup log,
since the port is printed there; that's why I forgot in the first place LOL)

> Support JMX RMI random port assign & retrieval via JMXConnectorServer
> -
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
>
> Currently there is no easy way to assign a JMX RMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=<port>
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to a fixed value is not usually a viable
> solution when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option;
> however, there's no easy way to retrieve such a port assignment. We propose to
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information
> via either the REST API or the WebUI.





[jira] [Commented] (FLINK-16789) Support JMX RMI random port assign & retrieval via JMXConnectorServer

2020-08-27 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186002#comment-17186002
 ] 

Rong Rong commented on FLINK-16789:
---

After discussion in [#13163|https://github.com/apache/flink/pull/13163], we
decided not to support this feature. Although using port 0 is considered the
standard JVM way of getting a random port (see:
[https://docs.oracle.com/javase/7/docs/api/java/net/ServerSocket.html]), it is
also very convenient for users to just directly configure a large port range
now that FLINK-5552 has been merged.

Closing this ticket as Won't Fix.
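For reference, the port-0 convention mentioned above as a minimal JDK sketch (the class name is illustrative):

{code:java}
import java.io.IOException;
import java.net.ServerSocket;

public class RandomPortExample {
    public static void main(String[] args) throws IOException {
        // Binding to port 0 asks the OS for any free ephemeral port.
        try (ServerSocket socket = new ServerSocket(0)) {
            // getLocalPort() reports the port the OS actually assigned.
            System.out.println("OS-assigned port: " + socket.getLocalPort());
        }
    }
}
{code}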

> Support JMX RMI random port assign & retrieval via JMXConnectorServer
> -
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
>
> Currently there is no easy way to assign a JMX RMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=<port>
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to a fixed value is not usually a viable
> solution when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option;
> however, there's no easy way to retrieve such a port assignment. We propose to
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information
> via either the REST API or the WebUI.





[jira] [Closed] (FLINK-16789) Support JMX RMI random port assign & retrieval via JMXConnectorServer

2020-08-27 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong closed FLINK-16789.
-
Resolution: Won't Fix

> Support JMX RMI random port assign & retrieval via JMXConnectorServer
> -
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
>
> Currently there is no easy way to assign a JMX RMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=<port>
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to a fixed value is not usually a viable
> solution when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option;
> however, there's no easy way to retrieve such a port assignment. We propose to
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information
> via either the REST API or the WebUI.





[jira] [Updated] (FLINK-16789) Support JMX RMI random port assign & retrieval via JMXConnectorServer

2020-08-16 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-16789:
--
Summary: Support JMX RMI random port assign & retrieval via 
JMXConnectorServer  (was: Support JMX RMI port retrieval via JMXConnectorServer)

> Support JMX RMI random port assign & retrieval via JMXConnectorServer
> -
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>
> Currently there is no easy way to assign a JMX RMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=<port>
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to a fixed value is not usually a viable
> solution when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option;
> however, there's no easy way to retrieve such a port assignment. We propose to
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information
> via either the REST API or the WebUI.





[jira] [Commented] (FLINK-18045) Newest version reintroduced a bug causing not working on secured MapR

2020-06-02 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124277#comment-17124277
 ] 

Rong Rong commented on FLINK-18045:
---

Yeah, that makes sense. +1 on the proposed solution #2.

> Newest version reintroduced a bug causing not working on secured MapR
> -
>
> Key: FLINK-18045
> URL: https://issues.apache.org/jira/browse/FLINK-18045
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.10.1, 1.11.0
>Reporter: Bart Krasinski
>Assignee: Bart Krasinski
>Priority: Critical
> Fix For: 1.11.0, 1.10.2
>
>
> I was not able to run Flink 1.10.1 on YARN on a secured MapR cluster, but 
> the previous version (1.10.0) works fine.
> After some investigation, it looks like the check for whether the enabled 
> security method is Kerberos was removed during some refactoring, effectively 
> reintroducing https://issues.apache.org/jira/browse/FLINK-5949
>  
> Refactoring commit: 
> [https://github.com/apache/flink/commit/8751e69037d8a9b1756b75eed62a368c3ef29137]
>  
> My proposal would be to bring back the Kerberos check:
> {code:java}
> loginUser.getAuthenticationMethod() == 
> UserGroupInformation.AuthenticationMethod.KERBEROS
> {code}
> and add a unit test for that case to prevent it from happening again.
> I'm happy to prepare a PR after reaching consensus.
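To make the quoted proposal concrete, a hedged sketch of the restored check; the enclosing class and method are illustrative assumptions, and only the {{getAuthenticationMethod()}} comparison comes from the snippet above:

{code:java}
import org.apache.hadoop.security.UserGroupInformation;

// Illustrative wrapper only; the KERBEROS comparison is the proposed check.
final class KerberosCheckSketch {
    static boolean isKerberosLogin(UserGroupInformation loginUser) {
        // Guards "secured" setups (e.g. MapR) that are not Kerberos-based:
        // only treat the login as Kerberos when Hadoop reports it as such.
        return loginUser.getAuthenticationMethod()
                == UserGroupInformation.AuthenticationMethod.KERBEROS;
    }
}
{code}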





[jira] [Commented] (FLINK-18045) Newest version reintroduced a bug causing not working on secured MapR

2020-06-02 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17123885#comment-17123885
 ] 

Rong Rong commented on FLINK-18045:
---

Yes, I think specifically checking that the auth method is Kerberos is the right
way. I think the proposal is to replace this line:
https://github.com/apache/flink/commit/8751e69037d8a9b1756b75eed62a368c3ef29137#diff-3648957eaf615f89c12aab6ea0611b99R116
correct? If so, I think that works.

> Newest version reintroduced a bug causing not working on secured MapR
> -
>
> Key: FLINK-18045
> URL: https://issues.apache.org/jira/browse/FLINK-18045
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.10.1, 1.11.0
>Reporter: Bart Krasinski
>Assignee: Bart Krasinski
>Priority: Critical
> Fix For: 1.11.0, 1.10.2
>
>
> I was not able to run Flink 1.10.1 on YARN on a secured MapR cluster, but 
> the previous version (1.10.0) works fine.
> After some investigation, it looks like the check for whether the enabled 
> security method is Kerberos was removed during some refactoring, effectively 
> reintroducing https://issues.apache.org/jira/browse/FLINK-5949
>  
> Refactoring commit: 
> [https://github.com/apache/flink/commit/8751e69037d8a9b1756b75eed62a368c3ef29137]
>  
> My proposal would be to bring back the Kerberos check:
> {code:java}
> loginUser.getAuthenticationMethod() == 
> UserGroupInformation.AuthenticationMethod.KERBEROS
> {code}
> and add a unit test for that case to prevent it from happening again.
> I'm happy to prepare a PR after reaching consensus.





[jira] [Updated] (FLINK-17386) Exception in HadoopSecurityContextFactory.createContext while no shaded-hadoop-lib provided.

2020-05-17 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-17386:
--
Component/s: Deployment / YARN

> Exception in HadoopSecurityContextFactory.createContext while no 
> shaded-hadoop-lib provided.
> 
>
> Key: FLINK-17386
> URL: https://issues.apache.org/jira/browse/FLINK-17386
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.11.0
>Reporter: Wenlong Lyu
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> java.io.IOException: Process execution failed due error. Error 
> output:java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.security.UserGroupInformation\n\tat 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.installContext(SecurityUtils.java:92)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:60)\n\tat
>  org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:964)\n\n\tat 
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:144)\n\tat
>  
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:126)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.runSingleJobCompileCheck(VVRCompileTest.java:173)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.lambda$runJobsCompileCheck$0(VVRCompileTest.java:101)\n\tat
>  
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)\n\tat
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)\n\tat
>  java.lang.Thread.run(Thread.java:834)
> I think it is because an exception is thrown in the static code block of 
> UserGroupInformation; should we catch Throwable instead of Exception in 
> HadoopSecurityContextFactory#createContext?
> [~rongr] what do you think?
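As a rough sketch of the suggested hardening (the factory outline and return values are assumptions, not the actual Flink source), catching LinkageError alongside the checked exception lets a missing hadoop-common degrade into a fallback instead of crashing the client:

{code:java}
// Sketch only; the real HadoopSecurityContextFactory differs.
public class HadoopSecurityContextFactorySketch {
    public Object createContext() {
        try {
            // Class.forName initializes the class, so a broken static
            // initializer (e.g. hadoop-common missing from the classpath)
            // surfaces as NoClassDefFoundError, a subtype of LinkageError.
            Class.forName("org.apache.hadoop.security.UserGroupInformation");
            return new Object(); // stand-in for a HadoopSecurityContext
        } catch (ClassNotFoundException | LinkageError e) {
            // Fall back to a no-op context instead of a hard failure.
            return null; // stand-in for a no-op security context
        }
    }
}
{code}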





[jira] [Updated] (FLINK-17386) Exception in HadoopSecurityContextFactory.createContext while no shaded-hadoop-lib provided.

2020-05-17 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-17386:
--
Affects Version/s: 1.11.0

> Exception in HadoopSecurityContextFactory.createContext while no 
> shaded-hadoop-lib provided.
> 
>
> Key: FLINK-17386
> URL: https://issues.apache.org/jira/browse/FLINK-17386
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.0
>Reporter: Wenlong Lyu
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> java.io.IOException: Process execution failed due error. Error 
> output:java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.security.UserGroupInformation\n\tat 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.installContext(SecurityUtils.java:92)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:60)\n\tat
>  org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:964)\n\n\tat 
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:144)\n\tat
>  
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:126)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.runSingleJobCompileCheck(VVRCompileTest.java:173)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.lambda$runJobsCompileCheck$0(VVRCompileTest.java:101)\n\tat
>  
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)\n\tat
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)\n\tat
>  java.lang.Thread.run(Thread.java:834)
> I think it is because an exception is thrown in the static code block of 
> UserGroupInformation; should we catch Throwable instead of Exception in 
> HadoopSecurityContextFactory#createContext?
> [~rongr] what do you think?





[jira] [Closed] (FLINK-17386) Exception in HadoopSecurityContextFactory.createContext while no shaded-hadoop-lib provided.

2020-05-17 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong closed FLINK-17386.
-
Fix Version/s: 1.11.0
   Resolution: Fixed

> Exception in HadoopSecurityContextFactory.createContext while no 
> shaded-hadoop-lib provided.
> 
>
> Key: FLINK-17386
> URL: https://issues.apache.org/jira/browse/FLINK-17386
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> java.io.IOException: Process execution failed due error. Error 
> output:java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.security.UserGroupInformation\n\tat 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.installContext(SecurityUtils.java:92)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:60)\n\tat
>  org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:964)\n\n\tat 
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:144)\n\tat
>  
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:126)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.runSingleJobCompileCheck(VVRCompileTest.java:173)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.lambda$runJobsCompileCheck$0(VVRCompileTest.java:101)\n\tat
>  
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)\n\tat
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)\n\tat
>  java.lang.Thread.run(Thread.java:834)
> I think it is because an exception is thrown in the static code block of 
> UserGroupInformation; should we catch Throwable instead of Exception in 
> HadoopSecurityContextFactory#createContext?
> [~rongr] what do you think?





[jira] [Commented] (FLINK-17386) Exception in HadoopSecurityContextFactory.createContext while no shaded-hadoop-lib provided.

2020-05-17 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109703#comment-17109703
 ] 

Rong Rong commented on FLINK-17386:
---

closed via: 1892bedeea9fa118b6e3bcb572f63c2e7f6d83e3

> Exception in HadoopSecurityContextFactory.createContext while no 
> shaded-hadoop-lib provided.
> 
>
> Key: FLINK-17386
> URL: https://issues.apache.org/jira/browse/FLINK-17386
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
>
> java.io.IOException: Process execution failed due error. Error 
> output:java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.security.UserGroupInformation\n\tat 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.installContext(SecurityUtils.java:92)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:60)\n\tat
>  org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:964)\n\n\tat 
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:144)\n\tat
>  
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:126)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.runSingleJobCompileCheck(VVRCompileTest.java:173)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.lambda$runJobsCompileCheck$0(VVRCompileTest.java:101)\n\tat
>  
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)\n\tat
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)\n\tat
>  java.lang.Thread.run(Thread.java:834)
> I think it is because an exception is thrown in the static code block of 
> UserGroupInformation; should we catch Throwable instead of Exception in 
> HadoopSecurityContextFactory#createContext?
> [~rongr] what do you think?





[jira] [Comment Edited] (FLINK-17386) Exception in HadoopSecurityContextFactory.createContext while no shaded-hadoop-lib provided.

2020-04-29 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094595#comment-17094595
 ] 

Rong Rong edited comment on FLINK-17386 at 4/29/20, 2:15 PM:
-

I just created a quick fix for this and attached a PR. If you could help test 
the patch later, that would be super great! Thx



was (Author: rongr):
Hmm, that's not what I expected. I have 2 theories:

1. In the 1.10 impl the SecurityContext was overridden using the same logic. However, 
the difference is that the classloader used in 1.10 could've been the runtime 
classloader (SecurityUtils.class.getClassLoader()), 
while in 1.11 the classloader is actually the service provider classloader 
(HadoopSecurityContextFactory.class.getClassLoader())
  - this may be an issue but I am not exactly sure.

2. In the 1.10 impl the context installation catches {{LinkageError}} as well as 
{{ClassNotFoundException}}, which is a much broader catch, and 
{{NoClassDefFoundError}} actually extends {{LinkageError}}.

I am more convinced that #2 is the root cause; I would try to create a quick 
fix for this and attach a PR. If you could help test the patch later, that 
would be super great!


> Exception in HadoopSecurityContextFactory.createContext while no 
> shaded-hadoop-lib provided.
> 
>
> Key: FLINK-17386
> URL: https://issues.apache.org/jira/browse/FLINK-17386
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
>
> java.io.IOException: Process execution failed due error. Error 
> output:java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.security.UserGroupInformation\n\tat 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.installContext(SecurityUtils.java:92)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:60)\n\tat
>  org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:964)\n\n\tat 
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:144)\n\tat
>  
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:126)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.runSingleJobCompileCheck(VVRCompileTest.java:173)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.lambda$runJobsCompileCheck$0(VVRCompileTest.java:101)\n\tat
>  
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)\n\tat
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)\n\tat
>  java.lang.Thread.run(Thread.java:834)
> I think it is because an exception is thrown in the static code block of 
> UserGroupInformation; should we catch Throwable instead of Exception in 
> HadoopSecurityContextFactory#createContext?
> [~rongr] what do you think?





[jira] [Assigned] (FLINK-17386) Exception in HadoopSecurityContextFactory.createContext while no shaded-hadoop-lib provided.

2020-04-28 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong reassigned FLINK-17386:
-

Assignee: Rong Rong

> Exception in HadoopSecurityContextFactory.createContext while no 
> shaded-hadoop-lib provided.
> 
>
> Key: FLINK-17386
> URL: https://issues.apache.org/jira/browse/FLINK-17386
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Assignee: Rong Rong
>Priority: Major
>
> java.io.IOException: Process execution failed due error. Error 
> output:java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.security.UserGroupInformation\n\tat 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.installContext(SecurityUtils.java:92)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:60)\n\tat
>  org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:964)\n\n\tat 
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:144)\n\tat
>  
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:126)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.runSingleJobCompileCheck(VVRCompileTest.java:173)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.lambda$runJobsCompileCheck$0(VVRCompileTest.java:101)\n\tat
>  
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)\n\tat
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)\n\tat
>  java.lang.Thread.run(Thread.java:834)
> I think it is because an exception is thrown in the static code block of 
> UserGroupInformation; should we catch Throwable instead of Exception in 
> HadoopSecurityContextFactory#createContext?
> [~rongr] what do you think?





[jira] [Commented] (FLINK-17386) Exception in HadoopSecurityContextFactory.createContext while no shaded-hadoop-lib provided.

2020-04-28 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094595#comment-17094595
 ] 

Rong Rong commented on FLINK-17386:
---

Hmm, that's not what I expected. I have 2 theories:

1. In the 1.10 impl the SecurityContext was overridden using the same logic. However, 
the difference is that the classloader used in 1.10 could've been the runtime 
classloader (SecurityUtils.class.getClassLoader()), 
while in 1.11 the classloader is actually the service provider classloader 
(HadoopSecurityContextFactory.class.getClassLoader())
  - this may be an issue but I am not exactly sure.

2. In the 1.10 impl the context installation catches {{LinkageError}} as well as 
{{ClassNotFoundException}}, which is a much broader catch, and 
{{NoClassDefFoundError}} actually extends {{LinkageError}}.

I am more convinced that #2 is the root cause; I would try to create a quick 
fix for this and attach a PR. If you could help test the patch later, that 
would be super great!
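A small self-contained sketch of the classloader difference in theory #1; the two Flink class names come from the stack trace below, while the probing loop itself is an illustrative assumption:

{code:java}
import org.apache.flink.runtime.security.SecurityUtils;
import org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory;

public class ClassLoaderProbe {
    public static void main(String[] args) {
        // 1.10-style lookup vs. 1.11-style lookup.
        ClassLoader runtimeLoader = SecurityUtils.class.getClassLoader();
        ClassLoader factoryLoader = HadoopSecurityContextFactory.class.getClassLoader();
        for (ClassLoader loader : new ClassLoader[] {runtimeLoader, factoryLoader}) {
            try {
                // initialize = false probes visibility without running
                // UserGroupInformation's static initializer.
                Class.forName("org.apache.hadoop.security.UserGroupInformation", false, loader);
                System.out.println(loader + ": visible");
            } catch (ClassNotFoundException e) {
                System.out.println(loader + ": not visible");
            }
        }
    }
}
{code}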


> Exception in HadoopSecurityContextFactory.createContext while no 
> shaded-hadoop-lib provided.
> 
>
> Key: FLINK-17386
> URL: https://issues.apache.org/jira/browse/FLINK-17386
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Priority: Major
>
> java.io.IOException: Process execution failed due error. Error 
> output:java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.security.UserGroupInformation\n\tat 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.installContext(SecurityUtils.java:92)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:60)\n\tat
>  org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:964)\n\n\tat 
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:144)\n\tat
>  
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:126)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.runSingleJobCompileCheck(VVRCompileTest.java:173)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.lambda$runJobsCompileCheck$0(VVRCompileTest.java:101)\n\tat
>  
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)\n\tat
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)\n\tat
>  java.lang.Thread.run(Thread.java:834)
> I think it is because an exception is thrown in the static code block of 
> UserGroupInformation; should we catch Throwable instead of Exception in 
> HadoopSecurityContextFactory#createContext?
> [~rongr] what do you think?





[jira] [Commented] (FLINK-17386) Exception in HadoopSecurityContextFactory.createContext while no shaded-hadoop-lib provided.

2020-04-27 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094152#comment-17094152
 ] 

Rong Rong commented on FLINK-17386:
---

I might have some idea of why this is happening, but just to be sure: is this 
issue occurring only on the latest unreleased Flink, or does it also affect 
previous versions? (I bet it does, but if you can verify, that would be great.)



> Exception in HadoopSecurityContextFactory.createContext while no 
> shaded-hadoop-lib provided.
> 
>
> Key: FLINK-17386
> URL: https://issues.apache.org/jira/browse/FLINK-17386
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Priority: Major
>
> java.io.IOException: Process execution failed due error. Error 
> output:java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.security.UserGroupInformation\n\tat 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.installContext(SecurityUtils.java:92)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:60)\n\tat
>  org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:964)\n\n\tat 
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:144)\n\tat
>  
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:126)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.runSingleJobCompileCheck(VVRCompileTest.java:173)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.lambda$runJobsCompileCheck$0(VVRCompileTest.java:101)\n\tat
>  
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)\n\tat
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)\n\tat
>  java.lang.Thread.run(Thread.java:834)
> I think it is because an exception is thrown in the static code block of 
> UserGroupInformation; should we catch Throwable instead of Exception in 
> HadoopSecurityContextFactory#createContext?
> [~rongr] what do you think?





[jira] [Commented] (FLINK-17386) Exception in HadoopSecurityContextFactory.createContext while no shaded-hadoop-lib provided.

2020-04-27 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094149#comment-17094149
 ] 

Rong Rong commented on FLINK-17386:
---

Hmm, interesting.
Just to understand more clearly: you are actually using some sort of Hadoop 
functionality, but not the {{flink-shaded-hadoop-*}} module, correct? That causes 
Flink to falsely think it should use the HadoopSecurityContext when it shouldn't.

However, the exception says:

{code:java}
java.io.IOException: Process execution failed due error. Error 
output:java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.hadoop.security.UserGroupInformation
org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)
...
{code}

So this means you are somehow including Hadoop but not hadoop-common?


> Exception in HadoopSecurityContextFactory.createContext while no 
> shaded-hadoop-lib provided.
> 
>
> Key: FLINK-17386
> URL: https://issues.apache.org/jira/browse/FLINK-17386
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Priority: Major
>
> java.io.IOException: Process execution failed due error. Error 
> output:java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.security.UserGroupInformation\n\tat 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.installContext(SecurityUtils.java:92)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:60)\n\tat
>  org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:964)\n\n\tat 
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:144)\n\tat
>  
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:126)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.runSingleJobCompileCheck(VVRCompileTest.java:173)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.lambda$runJobsCompileCheck$0(VVRCompileTest.java:101)\n\tat
>  
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)\n\tat
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)\n\tat
>  java.lang.Thread.run(Thread.java:834)
> I think it is because an exception is thrown in the static code block of 
> UserGroupInformation; should we catch Throwable instead of Exception in 
> HadoopSecurityContextFactory#createContext?
> [~rongr] what do you think?





[jira] [Commented] (FLINK-17386) Exception in HadoopSecurityContextFactory.createContext while no shaded-hadoop-lib provided.

2020-04-26 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17092732#comment-17092732
 ] 

Rong Rong commented on FLINK-17386:
---

Could you provide more information on how to reproduce this exception?

From what I understand, you do want to use the shaded-hadoop-lib but somehow 
forgot to include it in the classpath, and your intention is to allow a graceful 
shutdown instead of a hard process exit. Correct?

> Exception in HadoopSecurityContextFactory.createContext while no 
> shaded-hadoop-lib provided.
> 
>
> Key: FLINK-17386
> URL: https://issues.apache.org/jira/browse/FLINK-17386
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Priority: Major
>
> java.io.IOException: Process execution failed due error. Error 
> output:java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.hadoop.security.UserGroupInformation\n\tat 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory.createContext(HadoopSecurityContextFactory.java:59)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.installContext(SecurityUtils.java:92)\n\tat
>  
> org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:60)\n\tat
>  org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:964)\n\n\tat 
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:144)\n\tat
>  
> com.alibaba.flink.vvr.util.AutoClosableProcess$AutoClosableProcessBuilder.runBlocking(AutoClosableProcess.java:126)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.runSingleJobCompileCheck(VVRCompileTest.java:173)\n\tat
>  
> com.alibaba.flink.vvr.VVRCompileTest.lambda$runJobsCompileCheck$0(VVRCompileTest.java:101)\n\tat
>  
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)\n\tat
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)\n\tat
>  java.lang.Thread.run(Thread.java:834)
> I think it is because an exception is thrown in the static code block of 
> UserGroupInformation; should we catch Throwable instead of Exception in 
> HadoopSecurityContextFactory#createContext?
> [~rongr] what do you think?





[jira] [Commented] (FLINK-16733) Refactor YarnClusterDescriptor

2020-04-15 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17084212#comment-17084212
 ] 

Rong Rong commented on FLINK-16733:
---

+1 to see this going. BTW, do we have a higher-level plan for how you would 
proceed with the refactoring?

> Refactor YarnClusterDescriptor
> --
>
> Key: FLINK-16733
> URL: https://issues.apache.org/jira/browse/FLINK-16733
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Minor
>
> Currently, YarnClusterDescriptor is not in a good shape. It has 1600+ lines 
> of codes, of which the method {{startAppMaster}} alone has 400+ codes, 
> leading to poor maintainability.





[jira] [Commented] (FLINK-16522) Use type hints to declare the signature of the methods

2020-04-06 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076873#comment-17076873
 ] 

Rong Rong commented on FLINK-16522:
---

Thanks for the clarification. Huge +1, looking forward to this :)

> Use type hints to declare the signature of the methods
> --
>
> Key: FLINK-16522
> URL: https://issues.apache.org/jira/browse/FLINK-16522
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Reporter: Huang Xingbo
>Assignee: Huang Xingbo
>Priority: Major
> Fix For: 1.11.0
>
>
> [Type Hints|https://www.python.org/dev/peps/pep-0484/] was introduced in 
> Python 3.5 and it would be great if we can declare the signature of the 
> methods using type hints and introduce [type 
> check|https://realpython.com/python-type-checking/] in the python APIs





[jira] [Updated] (FLINK-16522) Use type hints to declare the signature of the methods

2020-04-06 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-16522:
--
Description: [Type Hints|https://www.python.org/dev/peps/pep-0484/] was 
introduced in Python 3.5 and it would be great if we can declare the signature 
of the methods using type hints and introduce [type 
check|https://realpython.com/python-type-checking/] in the python APIs  (was: 
[Type Hints|https://www.python.org/dev/peps/pep-0484/] was introduced in Python 
3.5 and it would be great if we can declare the signature of the methods using 
type hints and introduce type check in the python APIs)

> Use type hints to declare the signature of the methods
> --
>
> Key: FLINK-16522
> URL: https://issues.apache.org/jira/browse/FLINK-16522
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Reporter: Huang Xingbo
>Assignee: Huang Xingbo
>Priority: Major
> Fix For: 1.11.0
>
>
> [Type Hints|https://www.python.org/dev/peps/pep-0484/] was introduced in 
> Python 3.5 and it would be great if we can declare the signature of the 
> methods using type hints and introduce [type 
> check|https://realpython.com/python-type-checking/] in the python APIs





[jira] [Updated] (FLINK-16522) Use type hints to declare the signature of the methods

2020-04-06 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-16522:
--
Description: [Type Hints|https://www.python.org/dev/peps/pep-0484/] was 
introduced in Python 3.5 and it would be great if we can declare the signature 
of the methods using type hints and introduce type check in the python APIs  
(was: Type Hints was introduced in Python 3.5 and it would be great if we can 
declare the signature of the methods using type hints.)

> Use type hints to declare the signature of the methods
> --
>
> Key: FLINK-16522
> URL: https://issues.apache.org/jira/browse/FLINK-16522
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Reporter: Huang Xingbo
>Assignee: Huang Xingbo
>Priority: Major
> Fix For: 1.11.0
>
>
> [Type Hints|https://www.python.org/dev/peps/pep-0484/] was introduced in 
> Python 3.5 and it would be great if we can declare the signature of the 
> methods using type hints and introduce type check in the python APIs





[jira] [Commented] (FLINK-16522) Use type hints to declare the signature of the methods

2020-04-01 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072854#comment-17072854
 ] 

Rong Rong commented on FLINK-16522:
---

Is this Jira referring to type checking as well 
([https://realpython.com/python-type-checking/]), or just to introducing type 
hints ([https://www.python.org/dev/peps/pep-0484/])?

With type checking/hints it might also help improve the API documentation 
(not sure if there's any doc auto-generation on the Python side, but at least 
the function itself now contains the type hints when looking it up).

IMO this is a great idea. +1 to have this implemented.

> Use type hints to declare the signature of the methods
> --
>
> Key: FLINK-16522
> URL: https://issues.apache.org/jira/browse/FLINK-16522
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Reporter: Huang Xingbo
>Assignee: Huang Xingbo
>Priority: Major
> Fix For: 1.11.0
>
>
> Type Hints was introduced in Python 3.5 and it would be great if we can 
> declare the signature of the methods using type hints.





[jira] [Comment Edited] (FLINK-16789) Support JMX RMI port retrieval via JMXConnectorServer

2020-03-26 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067714#comment-17067714
 ] 

Rong Rong edited comment on FLINK-16789 at 3/26/20, 3:03 PM:
-

Yes, I agree. In fact, the {{JMXServer}} class in the reporter can be directly 
reused. I would create a PR for the refactoring against FLINK-5552 to split out 
the JMXServer and put it at the cluster level, and then for this ticket I would 
only need to add the JMXRMI-related components.


was (Author: rongr):
Yes, I agree. In fact, the {{JMXServer}} class in the reporter can be directly 
reused. I would create a PR for the refactoring against FLINK-5552 to split out 
the JMXServer and put it at the cluster level, and then I would only need to 
add the JMXRMI-related components.

> Support JMX RMI port retrieval via JMXConnectorServer
> -
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>
> Currently there is no easy way to assign a JMX RMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=<port>
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to a fixed value is not usually a viable
> solution when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option;
> however, there's no easy way to retrieve such a port assignment. We propose to
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information
> via either the REST API or the WebUI.





[jira] [Updated] (FLINK-16789) Support JMX RMI port retrieval via JMXConnectorServer

2020-03-26 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-16789:
--
Summary: Support JMX RMI port retrieval via JMXConnectorServer  (was: 
Support JMX RMI via JMXConnectorServer)

> Support JMX RMI port retrieval via JMXConnectorServer
> -
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>
> Currently there is no easy way to assign a JMX RMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=<port>
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to a fixed value is not usually a viable
> solution when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option;
> however, there's no easy way to retrieve such a port assignment. We propose to
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information
> via either the REST API or the WebUI.





[jira] [Commented] (FLINK-16789) Support JMX RMI via JMXConnectorServer

2020-03-26 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067714#comment-17067714
 ] 

Rong Rong commented on FLINK-16789:
---

Yes, I agree. In fact, the {{JMXServer}} class in the reporter can be directly 
reused. I would create a PR for the refactoring against FLINK-5552 to split out 
the JMXServer and put it at the cluster level, and then I would only need to 
add the JMXRMI-related components.

> Support JMX RMI via JMXConnectorServer
> --
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>
> Currently there is no easy way to assign a JMX RMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=<port>
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to a fixed value is not usually a viable
> solution when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option;
> however, there's no easy way to retrieve such a port assignment. We propose to
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information
> via either the REST API or the WebUI.





[jira] [Commented] (FLINK-16789) Support JMX RMI via JMXConnectorServer

2020-03-25 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067343#comment-17067343
 ] 

Rong Rong commented on FLINK-16789:
---

I've shared a code snippet in 
[https://github.com/apache/flink/compare/master...acbb9b26b6745274dbb33209334e8bdf9e919fea]
 --> this is far from ready, as there is a lot of config / testing involved, but 
it gives the gist of what I think would work.

Please kindly take a look.

> Support JMX RMI via JMXConnectorServer
> --
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>
> Currently there is no easy way to assign a JMX RMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=<port>
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to a fixed value is not usually a viable
> solution when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option;
> however, there's no easy way to retrieve such a port assignment. We propose to
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information
> via either the REST API or the WebUI.





[jira] [Issue Comment Deleted] (FLINK-5552) Make the JMX port available through RESTful API

2020-03-25 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-5552:
-
Comment: was deleted

(was: Yes, I think having it in the log is possible. However, based on my 
experience, we need to configure extra logging options in order to make it work:
1. We use log4j / slf4j as our main logging mechanism (I think this is also one 
of the 2 default supported loggers (logback/log4j)).
2. com.sun.management.jmxremote.* uses java.util.logging instead, so we have to 
make an extra effort: configuring an SLF4JBridgeHandler as part of the 
java.util.logging properties.
This is a bit cumbersome.

Regardless, I think I figured out a way to start the JMX server. I pushed the 
code snippet to 
https://github.com/apache/flink/compare/master...acbb9b26b6745274dbb33209334e8bdf9e919fea
 --> this is far from ready, as there is a lot of config / testing involved, but 
it gives the gist of what I think would work.

Please kindly take a look.)

> Make the JMX port available through RESTful API
> ---
>
> Key: FLINK-5552
> URL: https://issues.apache.org/jira/browse/FLINK-5552
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Runtime / Web Frontend
>Reporter: david.wang
>Assignee: Rong Rong
>Priority: Major
>
> Currently, JMXReporter will create a server for a JMX viewer to retrieve JMX 
> stats. The port can be configured through configuration options, but for a 
> large cluster with many machines running many Flink instances and other 
> processes, we can't set a fixed port for the JMX server, making it difficult 
> to get the JMX port.
> This JIRA suggests adding an API to the web frontend so that it is very easy 
> to get the JMX port for the JM and TM.





[jira] [Updated] (FLINK-16789) Support JMX RMI via JMXConnectorServer

2020-03-25 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-16789:
--
Description: 
Currently there is no easy way to assign a JMX RMI port to a running Flink job.

The typical tutorial is to add the following to both the TM and JM launch env:
{code:java}
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=<port>
-Dcom.sun.management.jmxremote.local.only=false
{code}
However, setting the jmxremote port to a fixed value is not usually a viable 
solution when the Flink job is running in a shared environment (YARN / K8s / etc.).

Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option; however, 
there's no easy way to retrieve such a port assignment. We propose to use 
JMXConnectorServerFactory to explicitly establish a JMXServer inside 
*{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.

With the JMXServer explicitly created, we can return the JMXRMI information via 
either the REST API or the WebUI.

  was:
Currently there is no easy way to assign a JMX RMI port to a running Flink job.

The typical tutorial is to add the following to both the TM and JM launch env:
{code}
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=<port>
-Dcom.sun.management.jmxremote.local.only=false
{code}
However, setting the jmxremote port to a fixed value is not usually a viable 
solution when the Flink job is running in a shared environment (YARN / K8s / etc.).

Setting {{-Dcom.sun.management.jmxremote.port=0}} is the best option; however, 
there's no easy way to retrieve such a port assignment. We propose to use 
JMXConnectorServerFactory to explicitly establish a JMXServer inside 
ClusterEntrypoint & TaskManagerRunner.




> Support JMX RMI via JMXConnectorServer
> --
>
> Key: FLINK-16789
> URL: https://issues.apache.org/jira/browse/FLINK-16789
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.11.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>
> Currently there is no easy way to assign a JMXRMI port to a running Flink job.
> The typical tutorial is to add the following to both the TM and JM launch env:
> {code:java}
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=
> -Dcom.sun.management.jmxremote.local.only=false
> {code}
> However, setting the jmxremote port to  is usually not a viable solution 
> when the Flink job is running in a shared environment (YARN / K8s / etc.).
> Setting *{{-Dcom.sun.management.jmxremote.port=0}}* is the best option; 
> however, there's no easy way to retrieve the assigned port. We propose to 
> use JMXConnectorServerFactory to explicitly establish a JMXServer inside 
> *{{ClusterEntrypoint}}* & *{{TaskManagerRunner}}*.
> With the JMXServer explicitly created, we can return the JMXRMI information 
> via either the REST API or the WebUI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16789) Support JMX RMI via JMXConnectorServer

2020-03-25 Thread Rong Rong (Jira)
Rong Rong created FLINK-16789:
-

 Summary: Support JMX RMI via JMXConnectorServer
 Key: FLINK-16789
 URL: https://issues.apache.org/jira/browse/FLINK-16789
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Coordination, Runtime / Task
Affects Versions: 1.11.0
Reporter: Rong Rong
Assignee: Rong Rong


Currently there is no easy way to assign a JMXRMI port to a running Flink job. 

The typical tutorial is to add the following to both the TM and JM launch env:
{code}
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=
-Dcom.sun.management.jmxremote.local.only=false
{code}
However, setting the jmxremote port to  is usually not a viable solution 
when the Flink job is running in a shared environment (YARN / K8s / etc.).

Setting {{-Dcom.sun.management.jmxremote.port=0}} is the best option; however, 
there's no easy way to retrieve the assigned port. We propose to use 
JMXConnectorServerFactory to explicitly establish a JMXServer inside 
ClusterEntrypoint & TaskManagerRunner.
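
For illustration only - a minimal, self-contained sketch of the idea, not the 
actual Flink change (the class name is hypothetical) - showing how 
JMXConnectorServerFactory can bind a JMXRMI server to an ephemeral port and 
then report the address that was actually assigned:
{code:java}
import java.lang.management.ManagementFactory;
import java.net.ServerSocket;
import java.rmi.registry.LocateRegistry;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

public class JmxRmiPortSketch {
    public static void main(String[] args) throws Exception {
        // Probe for a free port first (note: there is a small race window
        // between closing the probe socket and binding the RMI registry).
        int port;
        try (ServerSocket probe = new ServerSocket(0)) {
            port = probe.getLocalPort();
        }
        // The RMI registry that the connector server binds "jmxrmi" into.
        LocateRegistry.createRegistry(port);

        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi://localhost:" + port
                        + "/jndi/rmi://localhost:" + port + "/jmxrmi");
        JMXConnectorServer server =
                JMXConnectorServerFactory.newJMXConnectorServer(url, null, mbs);
        server.start();

        // The resolved address is exactly the piece of information a REST
        // handler or the WebUI could later expose.
        System.out.println("JMXRMI available at: " + server.getAddress());
    }
}
{code}
A JMX client such as jconsole can then attach using the printed service URL.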





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-5552) Make the JMX port available through RESTful API

2020-03-25 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067341#comment-17067341
 ] 

Rong Rong commented on FLINK-5552:
--

Oops... I just realized that I am not talking about the exact same issue. Yes, 
you are right [~chesnay], the log is actually available. 

I will create another ticket for the one I was talking about.

> Make the JMX port available through RESTful API
> ---
>
> Key: FLINK-5552
> URL: https://issues.apache.org/jira/browse/FLINK-5552
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Runtime / Web Frontend
>Reporter: david.wang
>Assignee: Rong Rong
>Priority: Major
>
> Currently, JMXReporter will create a server for JMX viewers to retrieve JMX 
> stats. The port can be configured through configuration options, but for a 
> large cluster with many machines running many Flink instances and other 
> processes, we can't set a fixed port for the JMX server, making it difficult 
> to get the JMX port.
> This JIRA is to suggest adding an API to the web frontend so that it is easy 
> to get the JMX port for the JM and TM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16483) Add Python building blocks to make sure the basic functionality of vectorized Python UDF could work

2020-03-25 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067335#comment-17067335
 ] 

Rong Rong commented on FLINK-16483:
---

Yeah, I took a look and it seems like simply aligning the versions works fine. So 
far I didn't run into any test issues. Created 
https://github.com/apache/flink/pull/11517. Will see how Azure/Travis goes.

> Add Python building blocks to make sure the basic functionality of vectorized 
> Python UDF could work
> ---
>
> Key: FLINK-16483
> URL: https://issues.apache.org/jira/browse/FLINK-16483
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Reporter: Dian Fu
>Assignee: Dian Fu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The aim of this Jira is to add Python building blocks, such as the coders, to 
> make sure the basic functionality of vectorized Python UDFs can work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-5552) Make the JMX port available through RESTful API

2020-03-25 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong reassigned FLINK-5552:


Assignee: Rong Rong

> Make the JMX port available through RESTful API
> ---
>
> Key: FLINK-5552
> URL: https://issues.apache.org/jira/browse/FLINK-5552
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Runtime / Web Frontend
>Reporter: david.wang
>Assignee: Rong Rong
>Priority: Major
>
> Currently, JMXReporter will create a server for JMX viewers to retrieve JMX 
> stats. The port can be configured through configuration options, but for a 
> large cluster with many machines running many Flink instances and other 
> processes, we can't set a fixed port for the JMX server, making it difficult 
> to get the JMX port.
> This JIRA is to suggest adding an API to the web frontend so that it is easy 
> to get the JMX port for the JM and TM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-5552) Make the JMX port available through RESTful API

2020-03-25 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067236#comment-17067236
 ] 

Rong Rong edited comment on FLINK-5552 at 3/26/20, 12:00 AM:
-

Yes, I think having it in the log is possible. However, based on my experience, 
we need to configure extra logging options in order to make it work:
1. We use log4j / slf4j as our main logging mechanism (I think this is also one 
of the two default supported loggers (logback/log4j)).
2. com.sun.management.jmxremote.* uses java.util.logging instead, so we have to 
make an extra effort: configuring an SLF4JBridgeHandler as part of the 
java.util.logging properties.
This is a bit cumbersome. 

Regardless, I think I figured out a way to start the JMX server - I pushed the 
code snippet to 
https://github.com/apache/flink/compare/master...acbb9b26b6745274dbb33209334e8bdf9e919fea
 --> this is far from ready, as there is a lot of config / testing involved, but 
it gives the gist of what I think would work. 

Please kindly take a look.


was (Author: rongr):
Yes, I think having it in the log is possible. However, based on my experience, 
we need to configure extra logging options in order to make it work:
1. We use log4j / slf4j as our main logging mechanism (I think this is also one 
of the two default supported loggers (logback/log4j)).
2. com.sun.management.jmxremote.* uses java.util.logging instead, so we have to 
make an extra effort:
  - configuring an SLF4JBridgeHandler as part of the java.util.logging 
properties.
This is a bit cumbersome. 

Regardless, I think I figured out a way to start the JMX server - I pushed the 
code snippet to 
https://github.com/apache/flink/compare/master...acbb9b26b6745274dbb33209334e8bdf9e919fea
 --> this is far from ready, as there is a lot of config / testing involved, but 
it gives the gist of what I think would work. 

Please kindly take a look.

> Make the JMX port available through RESTful API
> ---
>
> Key: FLINK-5552
> URL: https://issues.apache.org/jira/browse/FLINK-5552
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Runtime / Web Frontend
>Reporter: david.wang
>Priority: Major
>
> Currently, JMXReporter will create a server for JMX viewers to retrieve JMX 
> stats. The port can be configured through configuration options, but for a 
> large cluster with many machines running many Flink instances and other 
> processes, we can't set a fixed port for the JMX server, making it difficult 
> to get the JMX port.
> This JIRA is to suggest adding an API to the web frontend so that it is easy 
> to get the JMX port for the JM and TM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-5552) Make the JMX port available through RESTful API

2020-03-25 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067236#comment-17067236
 ] 

Rong Rong commented on FLINK-5552:
--

Yes, I think having it in the log is possible. However, based on my experience, 
we need to configure extra logging options in order to make it work:
1. We use log4j / slf4j as our main logging mechanism (I think this is also one 
of the two default supported loggers (logback/log4j)).
2. com.sun.management.jmxremote.* uses java.util.logging instead, so we have to 
make an extra effort:
  - configuring an SLF4JBridgeHandler as part of the java.util.logging 
properties.
This is a bit cumbersome.
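
A minimal sketch of that bridge setup, assuming the jul-to-slf4j artifact is on 
the classpath (the class name is illustrative only):
{code:java}
import org.slf4j.bridge.SLF4JBridgeHandler;

public class JulBridgeSketch {
    public static void main(String[] args) {
        // Detach the default java.util.logging handlers from the root logger,
        // then route all j.u.l. records through SLF4J (and on to log4j/logback).
        SLF4JBridgeHandler.removeHandlersForRootLogger();
        SLF4JBridgeHandler.install();
    }
}
{code}
The same effect can also be achieved declaratively by setting 
{{handlers = org.slf4j.bridge.SLF4JBridgeHandler}} in a java.util.logging 
properties file.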

Regardless, I think I figured out a way to start the JMX server - I pushed the 
code snippet to 
https://github.com/apache/flink/compare/master...acbb9b26b6745274dbb33209334e8bdf9e919fea
 --> this is far from ready, as there is a lot of config / testing involved, but 
it gives the gist of what I think would work. 

Please kindly take a look.

> Make the JMX port available through RESTful API
> ---
>
> Key: FLINK-5552
> URL: https://issues.apache.org/jira/browse/FLINK-5552
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Runtime / Web Frontend
>Reporter: david.wang
>Priority: Major
>
> Currently, JMXReporter will create a server for JMX viewers to retrieve JMX 
> stats. The port can be configured through configuration options, but for a 
> large cluster with many machines running many Flink instances and other 
> processes, we can't set a fixed port for the JMX server, making it difficult 
> to get the JMX port.
> This JIRA is to suggest adding an API to the web frontend so that it is easy 
> to get the JMX port for the JM and TM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16483) Add Python building blocks to make sure the basic functionality of vectorized Python UDF could work

2020-03-25 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067073#comment-17067073
 ] 

Rong Rong commented on FLINK-16483:
---

Hi, I am getting this error after rebasing this PR.

{code}
ERROR: apache-beam 2.19.0 has requirement pyarrow<0.16.0,>=0.15.1, but you'll 
have pyarrow 0.16.0 which is incompatible.
{code}

Is this intended?


> Add Python building blocks to make sure the basic functionality of vectorized 
> Python UDF could work
> ---
>
> Key: FLINK-16483
> URL: https://issues.apache.org/jira/browse/FLINK-16483
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Python
>Reporter: Dian Fu
>Assignee: Dian Fu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The aim of this Jira is to add Python building blocks, such as the coders, to 
> make sure the basic functionality of vectorized Python UDFs can work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-5552) Make the JMX port available through RESTful API

2020-03-23 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17064900#comment-17064900
 ] 

Rong Rong commented on FLINK-5552:
--

Hi, I would like to follow up on this ticket per some recent ML discussions. 

Any chance we can make it available in either the LOG files or a static content 
file first? AFAIK, once the JMX port is established, it will no longer change. It 
makes sense to surface the information first - once we have the content ready, we 
may be able to serve it out from the REST API as static content. 
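
To make the "static content" idea concrete, here is a purely hypothetical sketch 
using the JDK's built-in HTTP server - not Flink's actual REST stack; the 
endpoint path and JSON field name are made up:
{code:java}
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class JmxPortEndpointSketch {
    // Serve the already-established (and therefore stable) JMX port as
    // static JSON content.
    static void serve(int jmxPort) throws Exception {
        HttpServer http = HttpServer.create(new InetSocketAddress(8081), 0);
        byte[] body = ("{\"jmx-port\": " + jmxPort + "}")
                .getBytes(StandardCharsets.UTF_8);
        http.createContext("/jmx", exchange -> {
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.getResponseBody().close();
        });
        http.start();
    }
}
{code}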

> Make the JMX port available through RESTful API
> ---
>
> Key: FLINK-5552
> URL: https://issues.apache.org/jira/browse/FLINK-5552
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Runtime / Web Frontend
>Reporter: david.wang
>Priority: Major
>
> Currently, JMXReporter will create a server for JMX viewers to retrieve JMX 
> stats. The port can be configured through configuration options, but for a 
> large cluster with many machines running many Flink instances and other 
> processes, we can't set a fixed port for the JMX server, making it difficult 
> to get the JMX port.
> This JIRA is to suggest adding an API to the web frontend so that it is easy 
> to get the JMX port for the JM and TM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-5552) Make the JMX port available through RESTful API

2020-03-23 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-5552:
-
Description: 
Currently, JMXReporter will create a server for JMX viewers to retrieve JMX stats. 
The port can be configured through configuration options, but for a large cluster 
with many machines running many Flink instances and other processes, we can't set 
a fixed port for the JMX server, making it difficult to get the JMX port.

This JIRA is to suggest adding an API to the web frontend so that it is easy 
to get the JMX port for the JM and TM.

  was:
Currently, JMXReporter will create a server for JMX viewers to retrieve JMX stats. 
The port can be configured through configuration options, but for a large cluster 
with many machines running many Flink instances and other processes, we can set 
a fixed port for the JMX server, making it difficult to get the JMX port.

This JIRA is to suggest adding an API to the web frontend so that it is easy 
to get the JMX port for the JM and TM.


> Make the JMX port available through RESTful API
> ---
>
> Key: FLINK-5552
> URL: https://issues.apache.org/jira/browse/FLINK-5552
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics, Runtime / Web Frontend
>Reporter: david.wang
>Priority: Major
>
> Currently, JMXReporter will create a server for JMX viewers to retrieve JMX 
> stats. The port can be configured through configuration options, but for a 
> large cluster with many machines running many Flink instances and other 
> processes, we can't set a fixed port for the JMX server, making it difficult 
> to get the JMX port.
> This JIRA is to suggest adding an API to the web frontend so that it is easy 
> to get the JMX port for the JM and TM.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (FLINK-11271) Improvement to Kerberos Security

2020-03-19 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong resolved FLINK-11271.
---
Fix Version/s: 1.11.0
   Resolution: Fixed

Converted FLINK-16224 into a separate issue, since it deals with Hadoop 
Delegation Tokens specifically, and closed this ticket as all subtasks are resolved.

> Improvement to Kerberos Security
> 
>
> Key: FLINK-11271
> URL: https://issues.apache.org/jira/browse/FLINK-11271
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
> Fix For: 1.11.0
>
>
> This is the master JIRA for the improvement listed in:
> https://docs.google.com/document/d/1rBLCpyQKg6Ld2P0DEgv4VIOMTwv4sitd7h7P5r202IE/edit#heading=h.y34f96ctioqk



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16224) Refine Hadoop Delegation Token based testing framework

2020-03-19 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-16224:
--
Parent: (was: FLINK-11271)
Issue Type: Improvement  (was: Sub-task)

> Refine Hadoop Delegation Token based testing framework
> --
>
> Key: FLINK-16224
> URL: https://issues.apache.org/jira/browse/FLINK-16224
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>
> Currently the SecureTestEnvironment doesn't support Hadoop delegation tokens, 
> which makes E2E testing of delegation-token-based YARN applications 
> impossible.
> Propose to enhance the testing framework to support delegation-token-based 
> launches in a YARN cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16241) Remove the license and notice file in flink-ml-lib module on release-1.10 branch

2020-02-24 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17043664#comment-17043664
 ] 

Rong Rong commented on FLINK-16241:
---

I think removing the license seems more reasonable.
Even if we decide to change how flink-ml is released in subsequent 1.10.x 
versions, we can always add them back.



> Remove the license and notice file in flink-ml-lib module on release-1.10 
> branch
> 
>
> Key: FLINK-16241
> URL: https://issues.apache.org/jira/browse/FLINK-16241
> Project: Flink
>  Issue Type: Bug
>  Components: Library / Machine Learning
>Affects Versions: 1.10.0
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Blocker
> Fix For: 1.10.1
>
>
> The jar of flink-ml-lib should not contain the license and notice file, as it 
> actually does not bundle the related dependencies. We should remove these 
> files on the release-1.10 branch.
> BTW, the release-1.9 branch does not have this problem since the license and 
> notice were added in 1.10. And on master (1.11), we will bundle the 
> dependencies, so the license and notice file should be kept; see FLINK-15847.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16236) Fix YARNSessionFIFOSecuredITCase not loading the correct security context factory

2020-02-22 Thread Rong Rong (Jira)
Rong Rong created FLINK-16236:
-

 Summary: Fix YARNSessionFIFOSecuredITCase not loading the correct 
security context factory
 Key: FLINK-16236
 URL: https://issues.apache.org/jira/browse/FLINK-16236
 Project: Flink
  Issue Type: Sub-task
  Components: Deployment / YARN
Reporter: Rong Rong
Assignee: Rong Rong


Follow-up on FLINK-11589. Currently, due to the override of the 
TestHadoopModuleFactory, the HadoopContextFactory is not loaded because of the 
compatibility checker.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Unify Kerberos credentials checking

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Summary: Unify Kerberos credentials checking  (was: Improve Kerberos 
delegation token login )

> Unify Kerberos credentials checking
> ---
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.10.1, 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the security HadoopModule's handling of delegation token login 
> utilizes 2 different code paths. 
> Flink needs to ensure a delegation token is also a valid format of 
> credential when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (FLINK-15561) Improve Kerberos delegation token login

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong resolved FLINK-15561.
---
Resolution: Fixed

> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.10.1, 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the security HadoopModule's handling of delegation token login 
> utilizes 2 different code paths. 
> Flink needs to ensure a delegation token is also a valid format of 
> credential when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15561) Improve Kerberos delegation token login

2020-02-21 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17042273#comment-17042273
 ] 

Rong Rong commented on FLINK-15561:
---

fixed:

master: 57c33961a55cff1068345198cb4669d9f1313bf8
release-1.10: 8751e69037d8a9b1756b75eed62a368c3ef29137


 

> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.10.1, 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the security HadoopModule's handling of delegation token login 
> utilizes 2 different code paths. 
> Flink needs to ensure a delegation token is also a valid format of 
> credential when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16224) Refine Hadoop Delegation Token based testing framework

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-16224:
--
Description: 
Currently the SecureTestEnvironment doesn't support Hadoop delegation tokens, 
which makes E2E testing of delegation-token-based YARN applications 
impossible.

Propose to enhance the testing framework to support delegation-token-based 
launches in a YARN cluster.

  was:
Currently the SecureTestEnvironment doesn't support Hadoop delegation tokens, 
which makes E2E testing of delegation-token-based YARN applications 
impossible.

Propose to enhance the testing framework to support delegation-token-based 
launches.


> Refine Hadoop Delegation Token based testing framework
> --
>
> Key: FLINK-16224
> URL: https://issues.apache.org/jira/browse/FLINK-16224
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>
> Currently the SecureTestEnvironment doesn't support Hadoop delegation tokens, 
> which makes E2E testing of delegation-token-based YARN applications 
> impossible.
> Propose to enhance the testing framework to support delegation-token-based 
> launches in a YARN cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16224) Refine Hadoop Delegation Token based testing framework

2020-02-21 Thread Rong Rong (Jira)
Rong Rong created FLINK-16224:
-

 Summary: Refine Hadoop Delegation Token based testing framework
 Key: FLINK-16224
 URL: https://issues.apache.org/jira/browse/FLINK-16224
 Project: Flink
  Issue Type: Sub-task
  Components: Deployment / YARN
Reporter: Rong Rong
Assignee: Rong Rong


Currently the SecureTestEnvironment doesn't support Hadoop delegation tokens, 
which makes E2E testing of delegation-token-based YARN applications 
impossible.

Propose to enhance the testing framework to support delegation-token-based 
launches.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Improve Kerberos delegation token login

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Fix Version/s: (was: 1.9.3)

> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.10.1, 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the security HadoopModule's handling of delegation token login 
> utilizes 2 different code paths. 
> Flink needs to ensure a delegation token is also a valid format of 
> credential when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Improve Kerberos delegation token login

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Description: 
Inspired by the discussion in 
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]

 

Currently, the security HadoopModule's handling of delegation token login 
utilizes 2 different code paths. 

Flink needs to ensure a delegation token is also a valid format of credential 
when launching the YARN context. See [1] 
[https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
 and [2] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]

  was:
Inspired by the discussion in 
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]

 

Currently, the way the security HadoopModule handles delegation token login 
seems not to be working.

Some suggested improvements include spawning a delegation token renewal thread. See: 
[1] 
[https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
 
 and [2] 
[https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]

Another is to ensure a delegation token is also a valid format of credential when 
launching the YARN context. See [1] 
[https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
 and [2] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]


> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.9.3, 1.10.1, 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the security HadoopModule's handling of delegation token login 
> utilizes 2 different code paths. 
> Flink needs to ensure a delegation token is also a valid format of 
> credential when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]
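
To illustrate the two credential paths referenced above, here is a minimal 
sketch against Hadoop's UserGroupInformation / Credentials APIs; the helper and 
its parameters are hypothetical, not Flink's actual code:
{code:java}
import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;

public class HadoopLoginSketch {
    static void login(String principal, String keytabPath, String tokenFilePath)
            throws Exception {
        Configuration conf = new Configuration();
        UserGroupInformation.setConfiguration(conf);
        if (keytabPath != null) {
            // Path 1: a full Kerberos login from a keytab.
            UserGroupInformation.loginUserFromKeytab(principal, keytabPath);
        } else {
            // Path 2: accept a pre-fetched delegation token file as a valid
            // credential and attach it to the current user.
            Credentials creds =
                    Credentials.readTokenStorageFile(new File(tokenFilePath), conf);
            UserGroupInformation.getCurrentUser().addCredentials(creds);
        }
    }
}
{code}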



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Improve Kerberos delegation token login

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Parent: FLINK-11271
Issue Type: Sub-task  (was: Bug)

> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.9.3, 1.10.1, 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the way the security HadoopModule handles delegation token login 
> seems not to be working.
> Some suggested improvements include spawning a delegation token renewal thread. See: 
> [1] 
> [https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
>  
>  and [2] 
> [https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]
> Another is to ensure a delegation token is also a valid format of credential 
> when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-11088) Allow pre-install Kerberos authentication keytab discovery on YARN

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-11088:
--
Fix Version/s: 1.11.0

> Allow pre-install Kerberos authentication keytab discovery on YARN
> --
>
> Key: FLINK-11088
> URL: https://issues.apache.org/jira/browse/FLINK-11088
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently flink-yarn assumes the keytab is shipped as an application master 
> environment local resource on the client side and will be distributed to all 
> the TMs. This does not work for YARN proxy user mode [1], since the proxy user 
> or super user might not have access to the actual users' keytabs, but can 
> request delegation tokens on the users' behalf. 
> Based on the type of security options for long-living YARN services [2], we 
> propose to make the keytab file path discovery configurable depending on the 
> launch mode of the YARN client. 
> Reference: 
> [1] 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html
> [2] 
> https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html#Securing_Long-lived_YARN_Services



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Improve Kerberos delegation token login

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Issue Type: Bug  (was: Improvement)

> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.9.3, 1.10.1, 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the way the security HadoopModule handles delegation token login 
> seems not to be working.
> Some suggested improvements include spawning a delegation token renewal thread. See: 
> [1] 
> [https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
>  
>  and [2] 
> [https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]
> Another is to ensure a delegation token is also a valid format of credential 
> when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Improve Kerberos delegation token login

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Fix Version/s: 1.10.1
   1.9.3

> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.9.3, 1.10.1, 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the way the security HadoopModule handles delegation token login 
> seems not to be working.
> Some suggested improvements include spawning a delegation token renewal thread. See: 
> [1] 
> [https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
>  
>  and [2] 
> [https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]
> Another is to ensure a delegation token is also a valid format of credential 
> when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Improve Kerberos delegation token login

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Fix Version/s: (was: 1.10.1)
   (was: 1.9.3)

> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the way the security HadoopModule handles delegation token login 
> seems not to be working.
> Some suggested improvements include spawning a delegation token renewal thread. See: 
> [1] 
> [https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
>  
>  and [2] 
> [https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]
> Another is to ensure a delegation token is also a valid format of credential 
> when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Improve Kerberos delegation token login

2020-02-21 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Issue Type: Improvement  (was: Bug)

> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.9.3, 1.10.1, 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the way the security HadoopModule handles delegation token login 
> seems not to be working.
> Some suggested improvements include spawning a delegation token renewal thread. See: 
> [1] 
> [https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
>  
>  and [2] 
> [https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]
> Another is to ensure a delegation token is also a valid format of credential 
> when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Improve Kerberos delegation token login

2020-01-31 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Description: 
Inspired by the discussion in 
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]

 

Currently, the way the security HadoopModule handles delegation token login 
seems not to be working.

Some suggested improvements include spawning a delegation token renewal thread. See: 
[1] 
[https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
 
 and [2] 
[https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]

Another is to ensure a delegation token is also a valid format of credential when 
launching the YARN context. See [1] 
[https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
 and [2] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]

  was:
Inspired by the discussion in 
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]

 

 

Currently, the way the security HadoopModule handles delegation token login 
seems not to be working.

Some suggested improvements include spawning a delegation token renewal thread. See: 
[1] 
[https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
 
 and [2] 
[https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]

Another is to ensure a delegation token is also a valid format of credential when 
launching the YARN context. See [1] 
[https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
 and [2] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]


> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
> Currently, the way the security HadoopModule handles delegation token login 
> seems not to be working.
> Some suggested improvements include spawning a delegation token renewal thread. See: 
> [1] 
> [https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
>  
>  and [2] 
> [https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]
> Another is to ensure a delegation token is also a valid format of credential 
> when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Improve Kerberos delegation token login

2020-01-31 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Description: 
Inspired by the discussion in 
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]

 

 

Currently, the way the security HadoopModule handles delegation token login 
seems not to be working.

Some suggested improvements include spawning a delegation token renewal thread. See: 
[1] 
[https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
 
 and [2] 
[https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]

Another is to ensure a delegation token is also a valid format of credential when 
launching the YARN context. See [1] 
[https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
 and [2] 
[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]

  was:
Currently, the way the security HadoopModule handles delegation token login 
seems not to be working.

Some suggested improvements include spawning a delegation token renewal thread. See: 
[1] 
https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84
 
and [2] 
https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538

Another is to ensure a delegation token is also a valid format of credential when 
launching the YARN context. See [1] 
https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484
 and [2] 
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146


> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available, usability
> Fix For: 1.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Inspired by the discussion in 
> [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Yarn-Kerberos-issue-td31894.html#a31933]
>  
>  
> Currently, the way the security HadoopModule handles delegation token login 
> seems not to be working.
> Some suggested improvements include spawning a delegation token renewal thread. See: 
> [1] 
> [https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84]
>  
>  and [2] 
> [https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538]
> Another is to ensure a delegation token is also a valid format of credential 
> when launching the YARN context. See [1] 
> [https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484]
>  and [2] 
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-22 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17021129#comment-17021129
 ] 

Rong Rong commented on FLINK-15447:
---

Oops, I think I completely misunderstood your intention [~victor-wong] 
regarding "\{{PWD}}". You meant the working directory that the YARN 
TM/JM's JVM process is running in, correct? I thought you meant to create a new 
YarnOption configuration key. 

Yes, I think this could work!
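
For concreteness, a hypothetical sketch of the idea (illustrative only, not 
Flink's actual launch code): resolve the tmp directory relative to the container 
working directory that YARN sets up and later deletes:
{code:java}
import java.io.File;

public class TmpDirSketch {
    public static void main(String[] args) {
        // YARN starts each container with the container's working directory
        // as the process CWD; YARN removes it when the container finishes.
        File tmpDir = new File(System.getProperty("user.dir"), "tmp");
        tmpDir.mkdirs(); // must exist before the JVM writes temp files there
        // In launch-command terms this corresponds to: -Djava.io.tmpdir=$PWD/tmp
        System.out.println("-Djava.io.tmpdir=" + tmpDir.getAbsolutePath());
    }
}
{code}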

> Change "java.io.tmpdir"  of JM/TM on Yarn to "{{PWD}}/tmp" 
> ---
>
> Key: FLINK-15447
> URL: https://issues.apache.org/jira/browse/FLINK-15447
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.9.1
>Reporter: Victor Wong
>Priority: Major
>
> Currently, when running Flink on Yarn, the "java.io.tmpdir" property is set 
> to the default value, which is "/tmp". 
>  
> Sometimes we ran into exceptions caused by a full "/tmp" directory, which 
> would not be cleaned automatically after applications finished.
> I think we can set "java.io.tmpdir" to the "PWD/tmp" directory, or 
> something similar. "PWD" will be replaced by Yarn with the true working 
> directory of the JM/TM, which will be cleaned automatically.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-21 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020757#comment-17020757
 ] 

Rong Rong commented on FLINK-15447:
---

Thanks [~victor-wong] for the explanation. I understand the problem much 
better now.

This is definitely a good question to address. One suggestion I have: can we 
put the goal / intent in the description? Based on the discussion, the summary 
of this Jira could be:

Title: Improve utilization of `java.io.tmpdir` for the YARN module

Description: To achieve:
_1) Tasks can utilize all disks when using tmp_
_2) Any undeleted tmp files will be deleted by the tasktracker when the 
task (job?) is done._

Utilizing a fully flexible {{$PWD/tmp}} path is one of the solutions, but it 
also runs into the issue [~fly_in_gis] mentions: {{$PWD}} can be anything.

One thing I can think of is, instead of letting users customize $PWD, we preset 
the location of {{tmpdir}} to be relative to the YARN container dir root, 
something like {{$CLUSTER_CONTAINER_DEFAULT_DIR_ROOT/$PWD_USER_DEFINE}}. What 
do you guys think?

> Change "java.io.tmpdir"  of JM/TM on Yarn to "{{PWD}}/tmp" 
> ---
>
> Key: FLINK-15447
> URL: https://issues.apache.org/jira/browse/FLINK-15447
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.9.1
>Reporter: Victor Wong
>Priority: Major
>
> Currently, when running Flink on Yarn, the "java.io.tmpdir" property is set 
> to the default value, which is "/tmp". 
>  
> Sometimes we ran into exceptions caused by a full "/tmp" directory, which 
> would not be cleaned automatically after applications finished.
> I think we can set "java.io.tmpdir" to the "PWD/tmp" directory, or 
> something similar. "PWD" will be replaced by Yarn with the true working 
> directory of the JM/TM, which will be cleaned automatically.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15671) Provide one place for Docker Images with all supported cluster modes

2020-01-21 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020506#comment-17020506
 ] 

Rong Rong commented on FLINK-15671:
---

Huge +1 on this. 

Thanks [~sewen] for bringing this to our attention! 

> Provide one place for Docker Images with all supported cluster modes
> 
>
> Key: FLINK-15671
> URL: https://issues.apache.org/jira/browse/FLINK-15671
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Docker
>Reporter: Stephan Ewen
>Priority: Major
>  Labels: usability
> Fix For: 1.11.0
>
>
> Currently, there are different places where Docker Images for Flink exist:
> * https://github.com/docker-flink
> * https://github.com/apache/flink/tree/master/flink-container/docker
> * https://github.com/apache/flink/tree/master/flink-contrib/docker-flink
> The different Dockerfiles in the different places support different modes; 
> for example, some support session clusters while others support the 
> "self-contained application mode" (per-job mode).
> There should be one single place to define images that cover both session 
> and application mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-14460) Active Kubernetes integration phase2 - Advanced Features

2020-01-21 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020505#comment-17020505
 ] 

Rong Rong edited comment on FLINK-14460 at 1/21/20 7:21 PM:


Thanks for sharing [~fly_in_gis].
 * FLINK-15671 seems like a GREAT idea!! Huge +1 on this.

 * Regarding the documentation of the E2E:
 1. I was actually having trouble running the quick instruction (it seems like 
the default image "flink:latest" was broken), and I had to dig into all the 
documents to find a solution - which prompts me to think we can make some 
improvements (such as fixing a particular Docker image tag).
 2. We probably also need to provide a job-cluster E2E (I think this is a good 
start: 
[https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/kubernetes.html#flink-job-cluster-on-kubernetes]
 but we could definitely unify them and make them easier to follow).

 * Regarding add-ons:
 Yes, that's what I meant. I actually find some add-ons (like the 
[Dashboard|https://github.com/kubernetes/dashboard#kubernetes-dashboard]) very 
useful when trying the native K8S out. 
 However, on second thought, I am not sure this fits in Flink's documentation, so 
let's leave it out of the discussion. What do you think?


was (Author: rongr):
Thanks for sharing [~fly_in_gis].
 * FLINK-15671 seems like a GREAT idea!! Huge +1 on this.

 * Regarding the documentation of the E2E:
 1. I was actually having trouble running the quick instruction (it seems like 
the default image "flink:latest" was broken), and I had to dig into all the 
documents to find a solution - which prompts me to think we can make some 
improvements. 
 2. We probably also need to provide a job-cluster E2E (I think this is a good 
start: 
[https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/kubernetes.html#flink-job-cluster-on-kubernetes]
 but we could definitely unify them and make them easier to follow).

 * Regarding add-ons:
 Yes, that's what I meant. I actually find some add-ons (like the 
[Dashboard|https://github.com/kubernetes/dashboard#kubernetes-dashboard]) very 
useful when trying the native K8S out. 
 However, on second thought, I am not sure this fits in Flink's documentation, so 
let's leave it out of the discussion. What do you think?

> Active Kubernetes integration phase2 - Advanced Features
> 
>
> Key: FLINK-14460
> URL: https://issues.apache.org/jira/browse/FLINK-14460
> Project: Flink
>  Issue Type: New Feature
>  Components: Deployment / Kubernetes
>Reporter: Yang Wang
>Priority: Major
>
> This is phase 2 of the active Kubernetes integration. It is an umbrella jira to 
> track all the advanced features and make Flink on Kubernetes production-ready.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14460) Active Kubernetes integration phase2 - Advanced Features

2020-01-21 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17020505#comment-17020505
 ] 

Rong Rong commented on FLINK-14460:
---

Thanks for sharing [~fly_in_gis].
 * FLINK-15671 seems like a GREAT idea!! Huge +1 on this.

 * Regarding the documentation of the E2E:
 1. I was actually having trouble running the quick instruction (it seems like 
the default image "flink:latest" was broken), and I had to dig into all the 
documents to find a solution - which prompts me to think we can make some 
improvements. 
 2. We probably also need to provide a job-cluster E2E (I think this is a good 
start: 
[https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/kubernetes.html#flink-job-cluster-on-kubernetes]
 but we could definitely unify them and make them easier to follow).

 * Regarding add-ons:
 Yes, that's what I meant. I actually find some add-ons (like the 
[Dashboard|https://github.com/kubernetes/dashboard#kubernetes-dashboard]) very 
useful when trying the native K8S out. 
 However, on second thought, I am not sure this fits in Flink's documentation, so 
let's leave it out of the discussion. What do you think?

> Active Kubernetes integration phase2 - Advanced Features
> 
>
> Key: FLINK-14460
> URL: https://issues.apache.org/jira/browse/FLINK-14460
> Project: Flink
>  Issue Type: New Feature
>  Components: Deployment / Kubernetes
>Reporter: Yang Wang
>Priority: Major
>
> This is phase 2 of the active Kubernetes integration. It is an umbrella jira to 
> track all the advanced features and make Flink on Kubernetes production ready.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-14460) Active Kubernetes integration phase2 - Advanced Features

2020-01-20 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019642#comment-17019642
 ] 

Rong Rong edited comment on FLINK-14460 at 1/20/20 5:13 PM:


Thanks [~fly_in_gis] for driving this. Really looking forward to native Flink 
on K8S. I briefly went through the document. I was wondering: should we have a 
JIRA to capture the work to "refine the documentation"? One thing I'd love to 
have is:
 * an E2E "quick start" section in the documentation (one for the Job cluster 
and one for the Session cluster) that users can successfully start running 
without reading through the remaining parts.

Also, we might be able to add an advanced usage section; some of the ideas I 
have are: how to create custom session cluster Flink images; what are the 
suggested add-ons we can run alongside a Flink session cluster? ...

What do you think?


was (Author: rongr):
Thanks [~fly_in_gis] for driving this. Really looking forward to native Flink 
on K8S. I briefly went through the document. I was wondering: should we have a 
JIRA to log the work to "refine the documentation"? One thing I'd love to have 
is:
 * an E2E "quick start" section in the documentation (one for the Job cluster 
and one for the Session cluster) that users can successfully start running 
without reading through the remaining parts.

Also, we might be able to add an advanced usage section; some of the ideas I 
have are: how to create custom session cluster Flink images; what are the 
suggested add-ons we can run alongside a Flink session cluster? ...

What do you think?

> Active Kubernetes integration phase2 - Advanced Features
> 
>
> Key: FLINK-14460
> URL: https://issues.apache.org/jira/browse/FLINK-14460
> Project: Flink
>  Issue Type: New Feature
>  Components: Deployment / Kubernetes
>Reporter: Yang Wang
>Priority: Major
>
> This is phase 2 of the active Kubernetes integration. It is an umbrella jira to 
> track all the advanced features and make Flink on Kubernetes production ready.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14460) Active Kubernetes integration phase2 - Advanced Features

2020-01-20 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019642#comment-17019642
 ] 

Rong Rong commented on FLINK-14460:
---

Thanks [~fly_in_gis] for driving this. Really looking forward to native Flink 
on K8S. I briefly went through the document. I was wondering: should we have a 
JIRA to log the work to "refine the documentation"? One thing I'd love to have 
is:
 * an E2E "quick start" section in the documentation (one for the Job cluster 
and one for the Session cluster) that users can successfully start running 
without reading through the remaining parts.

Also, we might be able to add an advanced usage section; some of the ideas I 
have are: how to create custom session cluster Flink images; what are the 
suggested add-ons we can run alongside a Flink session cluster? ...

What do you think?

> Active Kubernetes integration phase2 - Advanced Features
> 
>
> Key: FLINK-14460
> URL: https://issues.apache.org/jira/browse/FLINK-14460
> Project: Flink
>  Issue Type: New Feature
>  Components: Deployment / Kubernetes
>Reporter: Yang Wang
>Priority: Major
>
> This is phase 2 of the active Kubernetes integration. It is an umbrella jira to 
> track all the advanced features and make Flink on Kubernetes production ready.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-18 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018645#comment-17018645
 ] 

Rong Rong edited comment on FLINK-15447 at 1/18/20 5:02 PM:


Ahh, I see your intention. Let me clarify what I understood - please correct me 
if I'm wrong:
* Both Flink-YARN (JM/TM) and some other 3rd parties use {{java.io.tmpdir}}, 
which is a JVM system property key.
* This means that, potentially, whatever directory is configured, all Java 
processes will share this directory as their tmp folder.

So could you clarify which is the main concern: 
1. We don't want to pollute {{/tmp}} - which will potentially also be used by 
NON-JVM processes.
2. We want Flink JM/TM to NOT share the directory with other JVM processes or 
other YARN containers. 


If the above analysis is correct: 
For #1, we actually create dedicated partitions to hold {{/tmp}} on our YARN 
nodes, which resolves the issue. Not sure if this can be a solution in your 
case. 
For #2, yes, I think the question is not easy to answer, especially when we 
want fine-grained control over disk resources. 


was (Author: rongr):
Ahh, I see your intention. Let me clarify what I understood - please correct me 
if I'm wrong:
* Both Flink-YARN (JM/TM) and some other 3rd parties use {{java.io.tmpdir}}, 
which is a JVM system property key.
* This means that, potentially, whatever directory is configured, all Java 
processes will share this directory as their tmp folder.

So could you clarify which is the main concern: 
1. The default value is set to {{/tmp}} - which will potentially also be used 
by NON-JVM processes.
2. In addition, we also want Flink JM/TM to NOT share the directory with other 
JVM processes or other YARN containers. 


If the above analysis is correct: 
For #1, we actually create dedicated partitions to hold {{/tmp}} on our YARN 
nodes, which resolves the issue. Not sure if this can be a solution in your 
case. 
For #2, yes, I think the question is not easy to answer, especially when we 
want fine-grained control over disk resources. 

> Change "java.io.tmpdir"  of JM/TM on Yarn to "{{PWD}}/tmp" 
> ---
>
> Key: FLINK-15447
> URL: https://issues.apache.org/jira/browse/FLINK-15447
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.9.1
>Reporter: Victor Wong
>Priority: Major
>
> Currently, when running Flink on Yarn, the "java.io.tmpdir" property is set 
> to the default value, which is "/tmp". 
>  
> Sometimes we ran into exceptions caused by a full "/tmp" directory, which 
> would not be cleaned automatically after applications finished.
> I think we can set "java.io.tmpdir" to the "PWD/tmp" directory, or 
> something similar. "PWD" will be replaced with the true working 
> directory of the JM/TM by Yarn, which will be cleaned automatically.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-18 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018645#comment-17018645
 ] 

Rong Rong edited comment on FLINK-15447 at 1/18/20 5:01 PM:


Ahh, I see your intention. Let me clarify what I understood - please correct me 
if I'm wrong:
* Both Flink-YARN (JM/TM) and some other 3rd parties use {{java.io.tmpdir}}, 
which is a JVM system property key.
* This means that, potentially, whatever directory is configured, all Java 
processes will share this directory as their tmp folder.

So could you clarify which is the main concern: 
1. The default value is set to {{/tmp}} - which will potentially also be used 
by NON-JVM processes.
2. In addition, we also want Flink JM/TM to NOT share the directory with other 
JVM processes or other YARN containers. 


If the above analysis is correct: 
For #1, we actually create dedicated partitions to hold {{/tmp}} on our YARN 
nodes, which resolves the issue. Not sure if this can be a solution in your 
case. 
For #2, yes, I think the question is not easy to answer, especially when we 
want fine-grained control over disk resources. 


was (Author: rongr):
Ahh, I see your intention. Let me clarify what I understood - please correct me 
if I'm wrong:
* Both Flink-YARN (JM/TM) and some other 3rd parties use {{java.io.tmpdir}}, 
which is a JVM system property key.
* This means that, potentially, whatever directory is configured via the system 
property, all Java processes will share this location.

So could you clarify which is the main concern: 
1. The default value is set to {{/tmp}} - which will potentially also be used 
by NON-JVM processes.
2. In addition, we also want Flink JM/TM to NOT share the directory with other 
JVM processes or other YARN containers. 


If the above analysis is correct: 
For #1, we actually create dedicated partitions to hold {{/tmp}} on our YARN 
nodes, which resolves the issue. Not sure if this can be a solution in your 
case. 
For #2, yes, I think the question is not easy to answer, especially when we 
want fine-grained control over disk resources. 

> Change "java.io.tmpdir"  of JM/TM on Yarn to "{{PWD}}/tmp" 
> ---
>
> Key: FLINK-15447
> URL: https://issues.apache.org/jira/browse/FLINK-15447
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.9.1
>Reporter: Victor Wong
>Priority: Major
>
> Currently, when running Flink on Yarn, the "java.io.tmpdir" property is set 
> to the default value, which is "/tmp". 
>  
> Sometimes we ran into exceptions caused by a full "/tmp" directory, which 
> would not be cleaned automatically after applications finished.
> I think we can set "java.io.tmpdir" to the "PWD/tmp" directory, or 
> something similar. "PWD" will be replaced with the true working 
> directory of the JM/TM by Yarn, which will be cleaned automatically.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-18 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018645#comment-17018645
 ] 

Rong Rong edited comment on FLINK-15447 at 1/18/20 5:01 PM:


Ahh, I see your intention. Let me clarify what I understood - please correct me 
if I'm wrong:
* Both Flink-YARN (JM/TM) and some other 3rd parties use {{java.io.tmpdir}}, 
which is a JVM system property key.
* This means that, potentially, whatever directory is configured via the system 
property, all Java processes will share this location.

So could you clarify which is the main concern: 
1. The default value is set to {{/tmp}} - which will potentially also be used 
by NON-JVM processes.
2. In addition, we also want Flink JM/TM to NOT share the directory with other 
JVM processes or other YARN containers. 


If the above analysis is correct: 
For #1, we actually create dedicated partitions to hold {{/tmp}} on our YARN 
nodes, which resolves the issue. Not sure if this can be a solution in your 
case. 
For #2, yes, I think the question is not easy to answer, especially when we 
want fine-grained control over disk resources. 


was (Author: rongr):
Ahh, I see your intention. Let me clarify what I understood - please correct me 
if I'm wrong:
* Both Flink-YARN (JM/TM) and some other 3rd parties use {{java.io.tmpdir}}, 
which is a JVM system property key.
* This potentially means that whatever directory is configured via the system 
property, all Java processes will share this location.

So could you clarify which is the main concern: 
1. The default value is set to {{/tmp}} - which will potentially also be used 
by NON-JVM processes.
2. In addition, we also want Flink JM/TM to NOT share the directory with other 
JVM processes or other YARN containers. 


If the above analysis is correct: 
For #1, we actually create dedicated partitions to hold {{/tmp}} on our YARN 
nodes, which resolves the issue. Not sure if this can be a solution in your 
case. 
For #2, yes, I think the question is not easy to answer, especially when we 
want fine-grained control over disk resources. 

> Change "java.io.tmpdir"  of JM/TM on Yarn to "{{PWD}}/tmp" 
> ---
>
> Key: FLINK-15447
> URL: https://issues.apache.org/jira/browse/FLINK-15447
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.9.1
>Reporter: Victor Wong
>Priority: Major
>
> Currently, when running Flink on Yarn, the "java.io.tmpdir" property is set 
> to the default value, which is "/tmp". 
>  
> Sometimes we ran into exceptions caused by a full "/tmp" directory, which 
> would not be cleaned automatically after applications finished.
> I think we can set "java.io.tmpdir" to the "PWD/tmp" directory, or 
> something similar. "PWD" will be replaced with the true working 
> directory of the JM/TM by Yarn, which will be cleaned automatically.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-18 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018645#comment-17018645
 ] 

Rong Rong commented on FLINK-15447:
---

Ahh, I see your intention. Let me clarify what I understood - please correct me 
if I'm wrong:
* Both Flink-YARN (JM/TM) and some other 3rd parties use {{java.io.tmpdir}}, 
which is a JVM system property key.
* This potentially means that whatever directory is configured via the system 
property, all Java processes will share this location.

So could you clarify which is the main concern: 
1. The default value is set to {{/tmp}} - which will potentially also be used 
by NON-JVM processes.
2. In addition, we also want Flink JM/TM to NOT share the directory with other 
JVM processes or other YARN containers. 


If the above analysis is correct: 
For #1, we actually create dedicated partitions to hold {{/tmp}} on our YARN 
nodes, which resolves the issue. Not sure if this can be a solution in your 
case. 
For #2, yes, I think the question is not easy to answer, especially when we 
want fine-grained control over disk resources. 

> Change "java.io.tmpdir"  of JM/TM on Yarn to "{{PWD}}/tmp" 
> ---
>
> Key: FLINK-15447
> URL: https://issues.apache.org/jira/browse/FLINK-15447
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.9.1
>Reporter: Victor Wong
>Priority: Major
>
> Currently, when running Flink on Yarn, the "java.io.tmpdir" property is set 
> to the default value, which is "/tmp". 
>  
> Sometimes we ran into exceptions caused by a full "/tmp" directory, which 
> would not be cleaned automatically after applications finished.
> I think we can set "java.io.tmpdir" to the "PWD/tmp" directory, or 
> something similar. "PWD" will be replaced with the true working 
> directory of the JM/TM by Yarn, which will be cleaned automatically.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-17 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018177#comment-17018177
 ] 

Rong Rong commented on FLINK-15447:
---

As far as I can tell, {{java.io.tmpdir}} is a JVM system property and should be 
set as a JVM launch param, similar to what [~xymaqingxiang] mentioned.

Could you elaborate on what it means for Flink-YARN to default the value to 
{{/tmp}}? I am guessing you mean the JVM defaults the value to {{/tmp}}?
* So far the only place I can see in the flink-yarn code utilizing this key is: 
https://github.com/apache/flink/blob/release-1.10/flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java#L899

Is your intention to have Flink override the JVM configuration internally and 
ignore the system environment config? 
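For reference, a minimal sketch of how the property behaves (assuming it is 
supplied at JVM launch time, e.g. via {{env.java.opts}} in flink-conf.yaml; the 
path below is purely illustrative):

{code:java}
// Launch-time configuration (illustrative path):
//   -Djava.io.tmpdir=/path/under/yarn/container/workdir/tmp
public class TmpDirCheck {
    public static void main(String[] args) {
        // The JVM resolves this system property once at startup;
        // on Linux the default is /tmp.
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));
    }
}
{code}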

> Change "java.io.tmpdir"  of JM/TM on Yarn to "{{PWD}}/tmp" 
> ---
>
> Key: FLINK-15447
> URL: https://issues.apache.org/jira/browse/FLINK-15447
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Affects Versions: 1.9.1
>Reporter: Victor Wong
>Priority: Major
>
> Currently, when running Flink on Yarn, the "java.io.tmpdir" property is set 
> to the default value, which is "/tmp". 
>  
> Sometimes we ran into exceptions caused by a full "/tmp" directory, which 
> would not be cleaned automatically after applications finished.
> I think we can set "java.io.tmpdir" to the "PWD/tmp" directory, or 
> something similar. "PWD" will be replaced with the true working 
> directory of the JM/TM by Yarn, which will be cleaned automatically.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15561) Improve Kerberos delegation token login

2020-01-14 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015545#comment-17015545
 ] 

Rong Rong commented on FLINK-15561:
---

Made a simple change for issue #2 and it looks good; maybe we can verify 
whether this fix works: 
https://github.com/walterddr/flink/commit/60240028bebc09e1d65328eb680a3a24108beb94

> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: usability
> Fix For: 1.11.0
>
>
> Currently the way the security HadoopModule handles delegation token login 
> seems to be broken.
> Some improvements include spawning a delegation token renewal thread. See: 
> [1] 
> https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84
>  
> and [2] 
> https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538
> Another is to ensure a delegation token is also accepted as a valid 
> credential format when launching the YARN context. See [1] 
> https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484
>  and [2] 
> https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146
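For illustration, a minimal sketch of what such a renewal thread could look 
like (an assumption-laden sketch, not the change linked above; it uses Hadoop's 
keytab re-login API as a stand-in for actual delegation token renewal):

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.security.UserGroupInformation;

public class RenewalThreadSketch {
    // Periodically re-login so credentials are refreshed before they expire.
    public static ScheduledExecutorService start(UserGroupInformation ugi,
                                                 long periodSeconds) {
        ScheduledExecutorService renewer =
            Executors.newSingleThreadScheduledExecutor();
        renewer.scheduleAtFixedRate(() -> {
            try {
                // No-op while the TGT is fresh; re-logins from the keytab otherwise.
                ugi.checkTGTAndReloginFromKeytab();
            } catch (Exception e) {
                // Log and retry on the next tick rather than killing the thread.
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return renewer;
    }
}
{code}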



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15561) Improve Kerberos delegation token login

2020-01-12 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15561:
--
Description: 
Currently the way the security HadoopModule handles delegation token login 
seems to be broken.

Some improvements include spawning a delegation token renewal thread. See: 
[1] 
https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84
 
and [2] 
https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538

Another is to ensure a delegation token is also accepted as a valid credential 
format when launching the YARN context. See [1] 
https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484
 and [2] 
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146

  was:
Currently the security HadoopModule handles delegation token login without 
spawning a delegation token renewal thread. We might need to include this to 
support delegation tokens.

See: [1] 
https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84
 
and 
[2] 
https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538


> Improve Kerberos delegation token login 
> 
>
> Key: FLINK-15561
> URL: https://issues.apache.org/jira/browse/FLINK-15561
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>
> Currently the way the security HadoopModule handles delegation token login 
> seems to be broken.
> Some improvements include spawning a delegation token renewal thread. See: 
> [1] 
> https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84
>  
> and [2] 
> https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538
> Another is to ensure a delegation token is also accepted as a valid 
> credential format when launching the YARN context. See [1] 
> https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L484
>  and [2] 
> https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L146



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15561) Improve Kerberos delegation token login

2020-01-12 Thread Rong Rong (Jira)
Rong Rong created FLINK-15561:
-

 Summary: Improve Kerberos delegation token login 
 Key: FLINK-15561
 URL: https://issues.apache.org/jira/browse/FLINK-15561
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Reporter: Rong Rong
Assignee: Rong Rong


Currently the security HadoopModule handles delegation token login without 
spawning a delegation token renewal thread. We might need to include this to 
support delegation tokens.

See: [1] 
https://github.com/apache/flink/blob/release-1.9/flink-runtime/src/main/java/org/apache/flink/runtime/security/modules/HadoopModule.java#L84
 
and 
[2] 
https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/security/UserGroupInformation.java#L538



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15475) Add isOutputTypeUsed() API to Transformation

2020-01-03 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15475:
--
Description: 
Currently there's no way to "peek" into a `Transformation` object and see if 
`typeUsed` has been set or not. The only way is to invoke the `setOutputType` 
API and wrap it in a try-catch block similar to:

{code:java}
try {
  ((SingleOutputStreamOperator) dataStream)
      .returns(myOutputType);
} catch (ValidationException ex) {
  // ... handle the exception thrown when the type has already been used.
}
{code}


It would be nice if we had an `isTypeUsed()` or `isOutputTypeUsed()` API to 
check whether a particular transformation has a definitive output type set / 
used or not.
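A rough sketch of what the proposed accessor and its call site could look like 
(hypothetical - the method does not exist yet, and the field name `typeUsed` is 
taken from the description above):

{code:java}
// Inside Transformation (hypothetical addition):
public boolean isOutputTypeUsed() {
    return typeUsed;
}

// Caller side: no try-catch needed anymore.
if (!transformation.isOutputTypeUsed()) {
    transformation.setOutputType(myOutputType);
}
{code}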



  was:
Currently there's no way to "peek" into a `Transformation` object and see if 
`typeUsed` has been set or not. The only way is to invoke the `setOutputType` 
API and wrap it in a try-catch block. See: 

It would be nice if we had an `isTypeUsed()` or `isOutputTypeUsed()` API to 
check whether a particular transformation has a definitive output type set / 
used or not.




> Add isOutputTypeUsed() API to Transformation
> 
>
> Key: FLINK-15475
> URL: https://issues.apache.org/jira/browse/FLINK-15475
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core, API / DataSet, API / DataStream
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Minor
>
> Currently there's no way to "peek" into a `Transformation` object and see if 
> `typeUsed` has been set or not. The only way is to invoke the `setOutputType` 
> API and wrap it in a try-catch block similar to:
> {code:java}
> try {
>   ((SingleOutputStreamOperator) dataStream)
>       .returns(myOutputType);
> } catch (ValidationException ex) {
>   // ... handle the exception thrown when the type has already been used.
> }
> {code}
> It would be nice if we had an `isTypeUsed()` or `isOutputTypeUsed()` API to 
> check whether a particular transformation has a definitive output type set / 
> used or not.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15475) Add isOutputTypeUsed() API to Transformation

2020-01-03 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15475:
--
Description: 
Currently there's no way to "peek" into a `Transformation` object and see if 
`typeUsed` has been set or not. The only way is to invoke the `setOutputType` 
API and wrap it in a try-catch block. 

It would be nice if we had an `isTypeUsed()` or `isOutputTypeUsed()` API to 
check whether a particular transformation has a definitive output type set / 
used or not.



  was:
Currently there's no way to "peek" into a Transformation and see if OutputType 
has been used or not. The only way is to invoke the {{setOutputType}} API and 
wrap it in a try-catch block. 

It would be nice if we had an `isTypeUsed()` or `isOutputTypeUsed()` API to 
check whether a particular transformation has a definitive output type set / 
used or not.




> Add isOutputTypeUsed() API to Transformation
> 
>
> Key: FLINK-15475
> URL: https://issues.apache.org/jira/browse/FLINK-15475
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core, API / DataSet, API / DataStream
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Minor
>
> Currently there's no way to "peek" into a `Transformation` object and see if 
> `typeUsed` has been set or not. The only way is to invoke the `setOutputType` 
> API and wrap it in a try-catch block. 
> It would be nice if we had an `isTypeUsed()` or `isOutputTypeUsed()` API to 
> check whether a particular transformation has a definitive output type set / 
> used or not.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-15475) Add isOutputTypeUsed() API to Transformation

2020-01-03 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-15475:
--
Description: 
Currently there's no way to "peek" into a `Transformation` object and see if 
`typeUsed` has been set or not. The only way is to invoke the `setOutputType` 
API and wrap it in a try-catch block. See: 

It would be nice if we had an `isTypeUsed()` or `isOutputTypeUsed()` API to 
check whether a particular transformation has a definitive output type set / 
used or not.



  was:
Currently there's no way to "peek" into a `Transformation` object and see if 
`typeUsed` has been set or not. The only way is to invoke the `setOutputType` 
API and wrap it in a try-catch block. 

It would be nice if we had an `isTypeUsed()` or `isOutputTypeUsed()` API to 
check whether a particular transformation has a definitive output type set / 
used or not.




> Add isOutputTypeUsed() API to Transformation
> 
>
> Key: FLINK-15475
> URL: https://issues.apache.org/jira/browse/FLINK-15475
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Core, API / DataSet, API / DataStream
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Minor
>
> Currently there's no way to "peek" into a `Transformation` object and see if 
> `typeUsed` has been set or not. The only way is to invoke the `setOutputType` 
> API and wrap it in a try-catch block. See: 
> It would be nice if we had an `isTypeUsed()` or `isOutputTypeUsed()` API to 
> check whether a particular transformation has a definitive output type set / 
> used or not.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15475) Add isOutputTypeUsed() API to Transformation

2020-01-03 Thread Rong Rong (Jira)
Rong Rong created FLINK-15475:
-

 Summary: Add isOutputTypeUsed() API to Transformation
 Key: FLINK-15475
 URL: https://issues.apache.org/jira/browse/FLINK-15475
 Project: Flink
  Issue Type: Improvement
  Components: API / Core, API / DataSet, API / DataStream
Reporter: Rong Rong
Assignee: Rong Rong


Currently there's no way to "peek" into a Transformation and see if OutputType 
has been used or not. The only way is to invoke the {{setOutputType}} API and 
wrap it in a try-catch block. 

It would be nice if we had an `isTypeUsed()` or `isOutputTypeUsed()` API to 
check whether a particular transformation has a definitive output type set / 
used or not.





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-11120) The bug of timestampadd handles time

2020-01-03 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17007605#comment-17007605
 ] 

Rong Rong commented on FLINK-11120:
---

Hi [~x1q1j1], yes, it will be in the 1.10 release, and it has already been 
backported to 1.9.x as well. 
Thanks [~jark] for fixing the JIRA status :-)

> The bug of timestampadd  handles time
> -
>
> Key: FLINK-11120
> URL: https://issues.apache.org/jira/browse/FLINK-11120
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Forward Xu
>Assignee: Forward Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.2, 1.10.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The error occurs when {{timestampadd(MINUTE, 1, time '01:00:00')}} is executed:
> java.lang.ClassCastException: java.lang.Integer cannot be cast to 
> java.lang.Long
> at org.apache.calcite.rex.RexBuilder.clean(RexBuilder.java:1520)
> at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:1318)
> at 
> org.apache.flink.table.codegen.ExpressionReducer.reduce(ExpressionReducer.scala:135)
> at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressionsInternal(ReduceExpressionsRule.java:620)
> at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressions(ReduceExpressionsRule.java:540)
> at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:288)
> I think it should meet the following conditions:
> ||expression||Expect the result||
> |timestampadd(MINUTE, -1, time '00:00:00')|23:59:00|
> |timestampadd(MINUTE, 1, time '00:00:00')|00:01:00|
> |timestampadd(MINUTE, 1, time '23:59:59')|00:00:59|
> |timestampadd(SECOND, 1, time '23:59:59')|00:00:00|
> |timestampadd(HOUR, 1, time '23:59:59')|00:59:59|
> This problem seems to be a bug in Calcite. I have submitted an issue to 
> Calcite; the following is the link:
> CALCITE-2699



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-15451) TaskManagerProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure failed on azure

2019-12-31 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong reassigned FLINK-15451:
-

Assignee: (was: Rong Rong)

> TaskManagerProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure 
> failed on azure
> --
>
> Key: FLINK-15451
> URL: https://issues.apache.org/jira/browse/FLINK-15451
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.9.1
>Reporter: Congxian Qiu(klion26)
>Priority: Major
>
> 2019-12-31T02:43:39.4766254Z [ERROR] Tests run: 2, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 42.801 s <<< FAILURE! - in 
> org.apache.flink.test.recovery.TaskManagerProcessFailureBatchRecoveryITCase 
> 2019-12-31T02:43:39.4768373Z [ERROR] 
> testTaskManagerProcessFailure[0](org.apache.flink.test.recovery.TaskManagerProcessFailureBatchRecoveryITCase)
>  Time elapsed: 2.699 s <<< ERROR! 2019-12-31T02:43:39.4768834Z 
> java.net.BindException: Address already in use 2019-12-31T02:43:39.4769096Z
>  
>  
> [https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_apis/build/builds/3995/logs/15]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-15451) TaskManagerProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure failed on azure

2019-12-31 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-15451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong reassigned FLINK-15451:
-

Assignee: Rong Rong

> TaskManagerProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure 
> failed on azure
> --
>
> Key: FLINK-15451
> URL: https://issues.apache.org/jira/browse/FLINK-15451
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.9.1
>Reporter: Congxian Qiu(klion26)
>Assignee: Rong Rong
>Priority: Major
>
> 2019-12-31T02:43:39.4766254Z [ERROR] Tests run: 2, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 42.801 s <<< FAILURE! - in 
> org.apache.flink.test.recovery.TaskManagerProcessFailureBatchRecoveryITCase 
> 2019-12-31T02:43:39.4768373Z [ERROR] 
> testTaskManagerProcessFailure[0](org.apache.flink.test.recovery.TaskManagerProcessFailureBatchRecoveryITCase)
>  Time elapsed: 2.699 s <<< ERROR! 2019-12-31T02:43:39.4768834Z 
> java.net.BindException: Address already in use 2019-12-31T02:43:39.4769096Z
>  
>  
> [https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_apis/build/builds/3995/logs/15]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-9630) Kafka09PartitionDiscoverer cause connection leak on TopicAuthorizationException

2019-12-16 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16997757#comment-16997757
 ] 

Rong Rong commented on FLINK-9630:
--

Hi All,

Is this somehow connected with FLINK-8497? We are investigating some similar 
issues with the PartitionDiscovery. 
Also, judging by the bug report, this doesn't seem to affect Kafka 0.11 and up, 
yes?

--
Rong

> Kafka09PartitionDiscoverer cause connection leak on 
> TopicAuthorizationException
> ---
>
> Key: FLINK-9630
> URL: https://issues.apache.org/jira/browse/FLINK-9630
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.4.2, 1.5.0
> Environment: Linux 2.6, java 8, Kafka broker 0.10.x
>Reporter: Youjun Yuan
>Priority: Major
>  Labels: pull-request-available
>
> when the Kafka topic got deleted during the task starting process, 
> Kafka09PartitionDiscoverer will get a *TopicAuthorizationException* in 
> getAllPartitionsForTopics(), and it gets no chance to close the 
> kafkaConsumer, resulting in a TCP connection leak (to the Kafka broker).
>  
> *this issue can bring down the whole Flink cluster*, because, in a default 
> setup (fixedDelay with INT.MAX restart attempts), the job manager will 
> randomly schedule the job to any TaskManager that has a free slot, and each 
> attempt will cause the TaskManager to leak a TCP connection. Eventually 
> almost every TaskManager will run out of file handles, so no TaskManager can 
> take a snapshot or accept new jobs, effectively stopping the whole cluster.
>  
> The leak happens when StreamTask.invoke() calls openAllOperators(), then 
> FlinkKafkaConsumerBase.open() calls partitionDiscoverer.discoverPartitions(), 
> and kafkaConsumer.partitionsFor(topic) in 
> KafkaPartitionDiscoverer.getAllPartitionsForTopics() hits a 
> *TopicAuthorizationException* that no one catches.
> Though StreamTask.open catches the Exception and invokes the dispose() method 
> of each operator, which eventually invokes FlinkKafkaConsumerBase.cancel(), 
> cancel() does not close the kafkaConsumer in the partitionDiscoverer, and 
> does not even invoke partitionDiscoverer.wakeup(), because the 
> discoveryLoopThread was null.
>  
> Below is the code of FlinkKafkaConsumerBase.cancel() for your convenience:
> {code:java}
> public void cancel() {
>     // set ourselves as not running;
>     // this would let the main discovery loop escape as soon as possible
>     running = false;
> 
>     if (discoveryLoopThread != null) {
>         if (partitionDiscoverer != null) {
>             // we cannot close the discoverer here, as it is error-prone to
>             // concurrent access; only wake up the discoverer, the discovery
>             // loop will clean itself up after it escapes
>             partitionDiscoverer.wakeup();
>         }
> 
>         // the discovery loop may currently be sleeping in-between
>         // consecutive discoveries; interrupt to shutdown faster
>         discoveryLoopThread.interrupt();
>     }
> 
>     // abort the fetcher, if there is one
>     if (kafkaFetcher != null) {
>         kafkaFetcher.cancel();
>     }
> }
> {code}
>  
>  
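For illustration, a minimal sketch of the fix idea (a hypothetical patch, not 
the actual merged change): close the consumer when partition fetching throws, 
so the broker connection is released instead of leaked:

{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.errors.TopicAuthorizationException;

class PartitionDiscoveryLeakFixSketch {
    static List<PartitionInfo> partitionsForTopics(KafkaConsumer<?, ?> consumer,
                                                   List<String> topics) {
        List<PartitionInfo> partitions = new ArrayList<>();
        try {
            for (String topic : topics) {
                // partitionsFor() is where the TopicAuthorizationException surfaces
                partitions.addAll(consumer.partitionsFor(topic));
            }
            return partitions;
        } catch (TopicAuthorizationException e) {
            // Close the consumer before propagating, so the TCP connection to
            // the broker is not leaked on every restart attempt.
            consumer.close();
            throw e;
        }
    }
}
{code}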



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-9630) Kafka09PartitionDiscoverer cause connection leak on TopicAuthorizationException

2019-12-16 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16997755#comment-16997755
 ] 

Rong Rong commented on FLINK-9630:
--

Hi All,

Is this a duplicate of FLINK-8497?

> Kafka09PartitionDiscoverer cause connection leak on 
> TopicAuthorizationException
> ---
>
> Key: FLINK-9630
> URL: https://issues.apache.org/jira/browse/FLINK-9630
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.4.2, 1.5.0
> Environment: Linux 2.6, java 8, Kafka broker 0.10.x
>Reporter: Youjun Yuan
>Priority: Major
>  Labels: pull-request-available
>
> when the Kafka topic got deleted during the task starting process, 
> Kafka09PartitionDiscoverer will get a *TopicAuthorizationException* in 
> getAllPartitionsForTopics(), and it gets no chance to close the 
> kafkaConsumer, resulting in a TCP connection leak (to the Kafka broker).
>  
> *this issue can bring down the whole Flink cluster*, because, in a default 
> setup (fixedDelay with INT.MAX restart attempts), the job manager will 
> randomly schedule the job to any TaskManager that has a free slot, and each 
> attempt will cause the TaskManager to leak a TCP connection. Eventually 
> almost every TaskManager will run out of file handles, so no TaskManager can 
> take a snapshot or accept new jobs, effectively stopping the whole cluster.
>  
> The leak happens when StreamTask.invoke() calls openAllOperators(), then 
> FlinkKafkaConsumerBase.open() calls partitionDiscoverer.discoverPartitions(), 
> and kafkaConsumer.partitionsFor(topic) in 
> KafkaPartitionDiscoverer.getAllPartitionsForTopics() hits a 
> *TopicAuthorizationException* that no one catches.
> Though StreamTask.open catches the Exception and invokes the dispose() method 
> of each operator, which eventually invokes FlinkKafkaConsumerBase.cancel(), 
> cancel() does not close the kafkaConsumer in the partitionDiscoverer, and 
> does not even invoke partitionDiscoverer.wakeup(), because the 
> discoveryLoopThread was null.
>  
> Below is the code of FlinkKafkaConsumerBase.cancel() for your convenience:
> {code:java}
> public void cancel() {
>     // set ourselves as not running;
>     // this would let the main discovery loop escape as soon as possible
>     running = false;
> 
>     if (discoveryLoopThread != null) {
>         if (partitionDiscoverer != null) {
>             // we cannot close the discoverer here, as it is error-prone to
>             // concurrent access; only wake up the discoverer, the discovery
>             // loop will clean itself up after it escapes
>             partitionDiscoverer.wakeup();
>         }
> 
>         // the discovery loop may currently be sleeping in-between
>         // consecutive discoveries; interrupt to shutdown faster
>         discoveryLoopThread.interrupt();
>     }
> 
>     // abort the fetcher, if there is one
>     if (kafkaFetcher != null) {
>         kafkaFetcher.cancel();
>     }
> }
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (FLINK-9630) Kafka09PartitionDiscoverer cause connection leak on TopicAuthorizationException

2019-12-16 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-9630:
-
Comment: was deleted

(was: Hi All,

Is this a duplicate of FLINK-8497?)

> Kafka09PartitionDiscoverer cause connection leak on 
> TopicAuthorizationException
> ---
>
> Key: FLINK-9630
> URL: https://issues.apache.org/jira/browse/FLINK-9630
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.4.2, 1.5.0
> Environment: Linux 2.6, java 8, Kafka broker 0.10.x
>Reporter: Youjun Yuan
>Priority: Major
>  Labels: pull-request-available
>
> when the Kafka topic got deleted during the task starting process, 
> Kafka09PartitionDiscoverer will get a *TopicAuthorizationException* in 
> getAllPartitionsForTopics(), and it gets no chance to close the 
> kafkaConsumer, resulting in a TCP connection leak (to the Kafka broker).
>  
> *this issue can bring down the whole Flink cluster*, because, in a default 
> setup (fixedDelay with INT.MAX restart attempts), the job manager will 
> randomly schedule the job to any TaskManager that has a free slot, and each 
> attempt will cause the TaskManager to leak a TCP connection. Eventually 
> almost every TaskManager will run out of file handles, so no TaskManager can 
> take a snapshot or accept new jobs, effectively stopping the whole cluster.
>  
> The leak happens when StreamTask.invoke() calls openAllOperators(), then 
> FlinkKafkaConsumerBase.open() calls partitionDiscoverer.discoverPartitions(), 
> and kafkaConsumer.partitionsFor(topic) in 
> KafkaPartitionDiscoverer.getAllPartitionsForTopics() hits a 
> *TopicAuthorizationException* that no one catches.
> Though StreamTask.open catches the Exception and invokes the dispose() method 
> of each operator, which eventually invokes FlinkKafkaConsumerBase.cancel(), 
> cancel() does not close the kafkaConsumer in the partitionDiscoverer, and 
> does not even invoke partitionDiscoverer.wakeup(), because the 
> discoveryLoopThread was null.
>  
> Below is the code of FlinkKafkaConsumerBase.cancel() for your convenience:
> {code:java}
> public void cancel() {
>     // set ourselves as not running;
>     // this would let the main discovery loop escape as soon as possible
>     running = false;
> 
>     if (discoveryLoopThread != null) {
>         if (partitionDiscoverer != null) {
>             // we cannot close the discoverer here, as it is error-prone to
>             // concurrent access; only wake up the discoverer, the discovery
>             // loop will clean itself up after it escapes
>             partitionDiscoverer.wakeup();
>         }
> 
>         // the discovery loop may currently be sleeping in-between
>         // consecutive discoveries; interrupt to shutdown faster
>         discoveryLoopThread.interrupt();
>     }
> 
>     // abort the fetcher, if there is one
>     if (kafkaFetcher != null) {
>         kafkaFetcher.cancel();
>     }
> }
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-11120) The bug of timestampadd handles time

2019-12-07 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong closed FLINK-11120.
-
Resolution: Fixed

> The bug of timestampadd  handles time
> -
>
> Key: FLINK-11120
> URL: https://issues.apache.org/jira/browse/FLINK-11120
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Forward Xu
>Assignee: Forward Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The error occurs when {{timestampadd(MINUTE, 1, time '01:00:00')}} is executed:
> java.lang.ClassCastException: java.lang.Integer cannot be cast to 
> java.lang.Long
> at org.apache.calcite.rex.RexBuilder.clean(RexBuilder.java:1520)
> at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:1318)
> at 
> org.apache.flink.table.codegen.ExpressionReducer.reduce(ExpressionReducer.scala:135)
> at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressionsInternal(ReduceExpressionsRule.java:620)
> at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressions(ReduceExpressionsRule.java:540)
> at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:288)
> I think it should meet the following conditions:
> ||expression||Expect the result||
> |timestampadd(MINUTE, -1, time '00:00:00')|23:59:00|
> |timestampadd(MINUTE, 1, time '00:00:00')|00:01:00|
> |timestampadd(MINUTE, 1, time '23:59:59')|00:00:59|
> |timestampadd(SECOND, 1, time '23:59:59')|00:00:00|
> |timestampadd(HOUR, 1, time '23:59:59')|00:59:59|
> This problem seems to be a bug in Calcite. I have submitted an issue to 
> Calcite; the following is the link:
> CALCITE-2699



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-11120) The bug of timestampadd handles time

2019-12-07 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990570#comment-16990570
 ] 

Rong Rong commented on FLINK-11120:
---

closed via: 
master: (9320f344f37019b9b389ef06477c8049e9fa3218, 
f5dafd3dda029e1529f8ba823777aec627707b97)
release-1.9: (86d518395b8e2236047181ad940a77a6c4d57ecb, 
3b420f6346b303dc47d01afa3afd50869a182e3f)


> The bug of timestampadd  handles time
> -
>
> Key: FLINK-11120
> URL: https://issues.apache.org/jira/browse/FLINK-11120
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Reporter: Forward Xu
>Assignee: Forward Xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The error occurs when {{timestampadd(MINUTE, 1, time '01:00:00')}} is executed:
> java.lang.ClassCastException: java.lang.Integer cannot be cast to 
> java.lang.Long
> at org.apache.calcite.rex.RexBuilder.clean(RexBuilder.java:1520)
> at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:1318)
> at 
> org.apache.flink.table.codegen.ExpressionReducer.reduce(ExpressionReducer.scala:135)
> at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressionsInternal(ReduceExpressionsRule.java:620)
> at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressions(ReduceExpressionsRule.java:540)
> at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:288)
> I think it should meet the following conditions:
> ||expression||Expect the result||
> |timestampadd(MINUTE, -1, time '00:00:00')|23:59:00|
> |timestampadd(MINUTE, 1, time '00:00:00')|00:01:00|
> |timestampadd(MINUTE, 1, time '23:59:59')|00:00:59|
> |timestampadd(SECOND, 1, time '23:59:59')|00:00:00|
> |timestampadd(HOUR, 1, time '23:59:59')|00:59:59|
> This problem seems to be a bug in Calcite. I have submitted an issue to 
> Calcite; the following is the link:
> CALCITE-2699



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14729) Multi-topics consuming from KafkaTableSource

2019-11-29 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985127#comment-16985127
 ] 

Rong Rong commented on FLINK-14729:
---

Hi [~fangpengcheng95] [~50man], I think it would be nice if you could share 
more of the motivation/problem statement for supporting consumption from 
multiple topics as the underlying data source for ONE table.


One of the reasons I can think of is support for handling something similar to 
a [Kafka 
DLQ|https://www.confluent.io/blog/kafka-connect-deep-dive-error-handling-dead-letter-queues/]
 - is this what you are trying to support? 

> Multi-topics consuming from KafkaTableSource
> 
>
> Key: FLINK-14729
> URL: https://issues.apache.org/jira/browse/FLINK-14729
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Kafka
>Reporter: Leo Zhang
>Priority: Major
>  Labels: features, pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hi, all. I propose a new functionality for KafkaTableSource which can consume 
> multiple topics at the same time. 
> *Design plan*
>  * Add a new constructor to KafkaTableSource which accepts a List of topics 
> as one parameter.
>  * Modify the existing one, which only accepts one topic as a String, to call 
> the proposed one to finish the instantiation - that is to say, wrap the topic 
> in a list and pass it to the multi-topic constructor (see the sketch below).
>  * Modify the overridden method createKafkaConsumer in KafkaTableSource to 
> pass topics as a List instead of a String.
>  * Replace the field topic with topics of List type in KafkaTableSourceBase 
> and update every place using topic to use topics. So we just need to modify 
> the KafkaTableSourceBase constructor, the getDataStream method, and equals 
> and hashCode.
> *Test plan*
> There is little to do, as KafkaTableSource is based on FlinkKafkaConsumer, 
> which already supports consuming multiple topics and is well tested. Of 
> course, we can easily add more tests if needed.
>  
> So what's your opinion?
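A minimal sketch of the constructor change described in the design plan above 
(the class name and parameter lists are trimmed down purely for illustration):

{code:java}
import java.util.Collections;
import java.util.List;

public class KafkaTableSourceSketch {
    private final List<String> topics;

    // New constructor: accepts multiple topics.
    public KafkaTableSourceSketch(List<String> topics) {
        this.topics = topics;
    }

    // Existing single-topic constructor delegates by wrapping the topic in a list.
    public KafkaTableSourceSketch(String topic) {
        this(Collections.singletonList(topic));
    }
}
{code}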



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14153) Add to BLAS a method that performs DenseMatrix and SparseVector multiplication.

2019-10-25 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960191#comment-16960191
 ] 

Rong Rong commented on FLINK-14153:
---

Committed via: 23a34b818d8267d16f6c22c77b79c67387da5e88

> Add to BLAS a method that performs DenseMatrix and SparseVector 
> multiplication.
> ---
>
> Key: FLINK-14153
> URL: https://issues.apache.org/jira/browse/FLINK-14153
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Machine Learning
>Reporter: Xu Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Previously there was only a "gemv" method in BLAS that performs 
> multiplication between a DenseMatrix and a DenseVector. Here we add another 
> one that performs multiplication between a DenseMatrix and a SparseVector.
>  * Add gemv method to BLAS.
>  * Add BLASTest that provides test cases for BLAS.
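For context, a hedged sketch of what a DenseMatrix x SparseVector "gemv" 
computes (plain Java arrays stand in for Flink ML's matrix/vector types; the 
names and storage layout are assumptions):

{code:java}
// y = alpha * A * x + beta * y, where the sparse vector x is stored as
// parallel arrays of indices and values for its non-zero entries.
static void gemv(double alpha, double[][] a,
                 int[] xIndices, double[] xValues,
                 double beta, double[] y) {
    for (int i = 0; i < y.length; i++) {
        double sum = 0.0;
        // Only the non-zero entries of x contribute to the dot product.
        for (int k = 0; k < xIndices.length; k++) {
            sum += a[i][xIndices[k]] * xValues[k];
        }
        y[i] = alpha * sum + beta * y[i];
    }
}
{code}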



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-14153) Add to BLAS a method that performs DenseMatrix and SparseVector multiplication.

2019-10-25 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960191#comment-16960191
 ] 

Rong Rong edited comment on FLINK-14153 at 10/25/19 10:43 PM:
--

merged to 1.10: 23a34b818d8267d16f6c22c77b79c67387da5e88


was (Author: rongr):
Committed via: 23a34b818d8267d16f6c22c77b79c67387da5e88

> Add to BLAS a method that performs DenseMatrix and SparseVector 
> multiplication.
> ---
>
> Key: FLINK-14153
> URL: https://issues.apache.org/jira/browse/FLINK-14153
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Machine Learning
>Reporter: Xu Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Previously there was only a "gemv" method in BLAS that performs 
> multiplication between a DenseMatrix and a DenseVector. Here we add another 
> one that performs multiplication between a DenseMatrix and a SparseVector.
>  * Add gemv method to BLAS.
>  * Add BLASTest that provides test cases for BLAS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (FLINK-14153) Add to BLAS a method that performs DenseMatrix and SparseVector multiplication.

2019-10-25 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong resolved FLINK-14153.
---
Resolution: Fixed

> Add to BLAS a method that performs DenseMatrix and SparseVector 
> multiplication.
> ---
>
> Key: FLINK-14153
> URL: https://issues.apache.org/jira/browse/FLINK-14153
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Machine Learning
>Reporter: Xu Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Previously there was only a "gemv" method in BLAS that performs 
> multiplication between a DenseMatrix and a DenseVector. Here we add another 
> one that performs multiplication between a DenseMatrix and a SparseVector.
>  * Add gemv method to BLAS.
>  * Add BLASTest that provides test cases for BLAS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14153) Add to BLAS a method that performs DenseMatrix and SparseVector multiplication.

2019-10-25 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-14153:
--
Fix Version/s: 1.10.0

> Add to BLAS a method that performs DenseMatrix and SparseVector 
> multiplication.
> ---
>
> Key: FLINK-14153
> URL: https://issues.apache.org/jira/browse/FLINK-14153
> Project: Flink
>  Issue Type: Sub-task
>  Components: Library / Machine Learning
>Reporter: Xu Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Previously there was only a "gemv" method in BLAS that performs 
> multiplication between a DenseMatrix and a DenseVector. Here we add another 
> one that performs multiplication between a DenseMatrix and a SparseVector.
>  * Add gemv method to BLAS.
>  * Add BLASTest that provides test cases for BLAS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-12399) FilterableTableSource does not use filters on job run

2019-10-19 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955236#comment-16955236
 ] 

Rong Rong commented on FLINK-12399:
---

merged to 1.9: c1019105c22455c554ab91b9fc2ef8512873bee8

> FilterableTableSource does not use filters on job run
> -
>
> Key: FLINK-12399
> URL: https://issues.apache.org/jira/browse/FLINK-12399
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.8.0
>Reporter: Josh Bradt
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
> Attachments: flink-filter-bug.tar.gz
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As discussed [on the mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Filter-push-down-not-working-for-a-custom-BatchTableSource-tp27654.html],
>  there appears to be a bug where a job that uses a custom 
> FilterableTableSource does not keep the filters that were pushed down into 
> the table source. More specifically, the table source does receive filters 
> via applyPredicates, and a new table source with those filters is returned, 
> but the final job graph appears to use the original table source, which does 
> not contain any filters.
> I attached a minimal example program to this ticket. The custom table source 
> is as follows: 
> {code:java}
> public class CustomTableSource implements BatchTableSource<Model>, 
> FilterableTableSource<Model> {
> 
>     private static final Logger LOG = 
>         LoggerFactory.getLogger(CustomTableSource.class);
> 
>     private final Filter[] filters;
>     private final FilterConverter converter = new FilterConverter();
> 
>     public CustomTableSource() {
>         this(null);
>     }
> 
>     private CustomTableSource(Filter[] filters) {
>         this.filters = filters;
>     }
> 
>     @Override
>     public DataSet<Model> getDataSet(ExecutionEnvironment execEnv) {
>         if (filters == null) {
>             LOG.info(" No filters defined ");
>         } else {
>             LOG.info(" Found filters ");
>             for (Filter filter : filters) {
>                 LOG.info("FILTER: {}", filter);
>             }
>         }
>         return execEnv.fromCollection(allModels());
>     }
> 
>     @Override
>     public TableSource<Model> applyPredicate(List<Expression> predicates) {
>         LOG.info("Applying predicates");
>         List<Filter> acceptedFilters = new ArrayList<>();
>         for (final Expression predicate : predicates) {
>             converter.convert(predicate).ifPresent(acceptedFilters::add);
>         }
>         return new CustomTableSource(acceptedFilters.toArray(new Filter[0]));
>     }
> 
>     @Override
>     public boolean isFilterPushedDown() {
>         return filters != null;
>     }
> 
>     @Override
>     public TypeInformation<Model> getReturnType() {
>         return TypeInformation.of(Model.class);
>     }
> 
>     @Override
>     public TableSchema getTableSchema() {
>         return TableSchema.fromTypeInfo(getReturnType());
>     }
> 
>     private List<Model> allModels() {
>         List<Model> models = new ArrayList<>();
>         models.add(new Model(1, 2, 3, 4));
>         models.add(new Model(10, 11, 12, 13));
>         models.add(new Model(20, 21, 22, 23));
>         return models;
>     }
> }
> {code}
>  
> When run, it logs
> {noformat}
> 15:24:54,888 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,901 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,910 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,977 INFO  com.klaviyo.filterbug.CustomTableSource  -  No filters defined
> {noformat}
> which appears to indicate that although filters are getting pushed down, the 
> final job does not use them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-12399) FilterableTableSource does not use filters on job run

2019-10-19 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-12399:
--
Fix Version/s: 1.9.2

> FilterableTableSource does not use filters on job run
> -
>
> Key: FLINK-12399
> URL: https://issues.apache.org/jira/browse/FLINK-12399
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.8.0
>Reporter: Josh Bradt
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.9.2
>
> Attachments: flink-filter-bug.tar.gz
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As discussed [on the mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Filter-push-down-not-working-for-a-custom-BatchTableSource-tp27654.html],
>  there appears to be a bug where a job that uses a custom 
> FilterableTableSource does not keep the filters that were pushed down into 
> the table source. More specifically, the table source does receive filters 
> via applyPredicate, and a new table source with those filters is returned, 
> but the final job graph appears to use the original table source, which does 
> not contain any filters.
> I attached a minimal example program to this ticket. The custom table source 
> is as follows: 
> {code:java}
> public class CustomTableSource implements BatchTableSource<Model>,
>         FilterableTableSource<Model> {
> 
>     private static final Logger LOG =
>             LoggerFactory.getLogger(CustomTableSource.class);
> 
>     private final Filter[] filters;
>     private final FilterConverter converter = new FilterConverter();
> 
>     public CustomTableSource() {
>         this(null);
>     }
> 
>     private CustomTableSource(Filter[] filters) {
>         this.filters = filters;
>     }
> 
>     @Override
>     public DataSet<Model> getDataSet(ExecutionEnvironment execEnv) {
>         if (filters == null) {
>             LOG.info(" No filters defined ");
>         } else {
>             LOG.info(" Found filters ");
>             for (Filter filter : filters) {
>                 LOG.info("FILTER: {}", filter);
>             }
>         }
>         return execEnv.fromCollection(allModels());
>     }
> 
>     @Override
>     public TableSource<Model> applyPredicate(List<Expression> predicates) {
>         LOG.info("Applying predicates");
>         List<Filter> acceptedFilters = new ArrayList<>();
>         for (final Expression predicate : predicates) {
>             converter.convert(predicate).ifPresent(acceptedFilters::add);
>         }
>         return new CustomTableSource(acceptedFilters.toArray(new Filter[0]));
>     }
> 
>     @Override
>     public boolean isFilterPushedDown() {
>         return filters != null;
>     }
> 
>     @Override
>     public TypeInformation<Model> getReturnType() {
>         return TypeInformation.of(Model.class);
>     }
> 
>     @Override
>     public TableSchema getTableSchema() {
>         return TableSchema.fromTypeInfo(getReturnType());
>     }
> 
>     private List<Model> allModels() {
>         List<Model> models = new ArrayList<>();
>         models.add(new Model(1, 2, 3, 4));
>         models.add(new Model(10, 11, 12, 13));
>         models.add(new Model(20, 21, 22, 23));
>         return models;
>     }
> }
> {code}
>  
> When run, it logs
> {noformat}
> 15:24:54,888 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,901 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,910 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,977 INFO  com.klaviyo.filterbug.CustomTableSource  -  No filters defined
> {noformat}
> which appears to indicate that although filters are getting pushed down, the 
> final job does not use them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14442) Add time based interval execution to JDBC connectors.

2019-10-18 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954803#comment-16954803
 ] 

Rong Rong commented on FLINK-14442:
---

Hi [~javadevmtl], sorry I misunderstood your intention in the mailing list. I 
thought you needed some sort of customized combined flushing strategy that 
supports both batch size and interval.

Yes, [~jark] is right. I think the async flushing should achieve your goal if 
you would like a flush, for example, exactly every 2 seconds, no matter how 
many messages are in the buffer (as long as it does not hit the batchInterval 
limit).
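To make the idea concrete, here is a rough sketch of such a time-based flush. 
The buffer, bufferLock, and flush() names are assumptions for illustration, not 
the actual connector code:

{code:java}
// Sketch under assumed names: flush the pending rows every 2 seconds, no
// matter how full the batch buffer is; the existing size-based flush still
// fires whenever batchInterval is reached first.
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleWithFixedDelay(() -> {
    synchronized (bufferLock) {
        if (!buffer.isEmpty()) {
            flush(); // hypothetical helper: executes and clears the JDBC batch
        }
    }
}, 2, 2, TimeUnit.SECONDS);
{code}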

> Add time based interval execution to JDBC connectors.
> -
>
> Key: FLINK-14442
> URL: https://issues.apache.org/jira/browse/FLINK-14442
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / JDBC
>Affects Versions: 1.8.0, 1.8.1, 1.8.2, 1.9.0
>Reporter: None none
>Priority: Minor
>
> Hi, currently the JDBC sink/output only supports batch-interval execution. 
> For data to be streamed/committed to the JDBC database, we need to wait for 
> the batch interval to fill up.
> For example, if you set a batch interval of 100 but only get 99 records, then 
> no data will be committed to the database.
> The JDBC connector should maybe also have a time-based interval so that data 
> is eventually pushed to the database.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-11936) Remove AuxiliaryConverter pull-in from calcite and fix auxiliary converter issue.

2019-09-18 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932569#comment-16932569
 ] 

Rong Rong commented on FLINK-11936:
---

[~danny0405] yes, I think that would be great.

> Remove AuxiliaryConverter pull-in from calcite and fix auxiliary converter 
> issue.
> -
>
> Key: FLINK-11936
> URL: https://issues.apache.org/jira/browse/FLINK-11936
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>
> AuxiliaryConverter was pulled in as part of FLINK-6409. Since CALCITE-1761 
> has been fixed, we should sync back with the Calcite version.
> After a quick glance, I think it is not so simple to just delete the class, so 
> I opened a follow-up Jira for this issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-12399) FilterableTableSource does not use filters on job run

2019-09-04 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16833107#comment-16833107
 ] 

Rong Rong edited comment on FLINK-12399 at 9/4/19 4:28 PM:
---

Hi [~josh.bradt], I think I found the root cause of this issue.

Apparently you have to override the {{explainSource}} method in order to let 
Calcite know that the newly created TableSource with the filters pushed down is 
different from the originally created CustomTableSource (on which 
applyPredicate has not been called).
I think this might be related to changelog point #4 of 
https://github.com/apache/flink/pull/8324: when I tried upgrading to Calcite 
1.19.0, I also encountered some weird issues where Calcite tries to find the 
correct table source via the digest strings.

I will assign this to myself and start looking into the issue. Please let me 
know if adding the override resolves your issue for the moment.
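For illustration, a minimal sketch of such an override on the CustomTableSource 
from this ticket; the digest format here is my own, not taken from the eventual 
fix:

{code:java}
// Sketch only: include the pushed-down filters in the source's digest so the
// planner can tell this instance apart from the original, filter-less one.
@Override
public String explainSource() {
    return "CustomTableSource(filters: " + java.util.Arrays.toString(filters) + ")";
}
{code}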


was (Author: walterddr):
Hi [~josh.bradt]. I think I found the root cause of this issue.

Apparently you have to override the method {{explainSource}} in order to let 
calcite know that the new created TableSource with filter pushedDown is 
different from the original created CustomeTableSource (where you have not 
applyPredicates).
I think this might be related to the #4 changelog point 
https://github.com/apache/flink/pull/8324: when I try upgrading to CALCITE 
1.19.0 I also encounter some weird issues where calcite tries to find the 
correct tablesource from the digest strings. 

I will assigned to myself and start looking into this issue. Please let me know 
if adding the override resolves your issue at this moment.

> FilterableTableSource does not use filters on job run
> -
>
> Key: FLINK-12399
> URL: https://issues.apache.org/jira/browse/FLINK-12399
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.8.0
>Reporter: Josh Bradt
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Attachments: flink-filter-bug.tar.gz
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As discussed [on the mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Filter-push-down-not-working-for-a-custom-BatchTableSource-tp27654.html],
>  there appears to be a bug where a job that uses a custom 
> FilterableTableSource does not keep the filters that were pushed down into 
> the table source. More specifically, the table source does receive filters 
> via applyPredicate, and a new table source with those filters is returned, 
> but the final job graph appears to use the original table source, which does 
> not contain any filters.
> I attached a minimal example program to this ticket. The custom table source 
> is as follows: 
> {code:java}
> public class CustomTableSource implements BatchTableSource<Model>,
>         FilterableTableSource<Model> {
> 
>     private static final Logger LOG =
>             LoggerFactory.getLogger(CustomTableSource.class);
> 
>     private final Filter[] filters;
>     private final FilterConverter converter = new FilterConverter();
> 
>     public CustomTableSource() {
>         this(null);
>     }
> 
>     private CustomTableSource(Filter[] filters) {
>         this.filters = filters;
>     }
> 
>     @Override
>     public DataSet<Model> getDataSet(ExecutionEnvironment execEnv) {
>         if (filters == null) {
>             LOG.info(" No filters defined ");
>         } else {
>             LOG.info(" Found filters ");
>             for (Filter filter : filters) {
>                 LOG.info("FILTER: {}", filter);
>             }
>         }
>         return execEnv.fromCollection(allModels());
>     }
> 
>     @Override
>     public TableSource<Model> applyPredicate(List<Expression> predicates) {
>         LOG.info("Applying predicates");
>         List<Filter> acceptedFilters = new ArrayList<>();
>         for (final Expression predicate : predicates) {
>             converter.convert(predicate).ifPresent(acceptedFilters::add);
>         }
>         return new CustomTableSource(acceptedFilters.toArray(new Filter[0]));
>     }
> 
>     @Override
>     public boolean isFilterPushedDown() {
>         return filters != null;
>     }
> 
>     @Override
>     public TypeInformation<Model> getReturnType() {
>         return TypeInformation.of(Model.class);
>     }
> 
>     @Override
>     public TableSchema getTableSchema() {
>         return TableSchema.fromTypeInfo(getReturnType());
>     }
> 
>     private List<Model> allModels() {
>         List<Model> models = new ArrayList<>();
>         models.add(new Model(1, 2, 3, 4));
>         models.add(new Model(10, 11, 12, 13));
>         models.add(new Model(20, 21, 22, 23));
>         return models;
>     }
> }
> {code}
>  
> When run, it logs
> {noformat}
> 15:24:54,888 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,901 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,910 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,977 INFO  com.klaviyo.filterbug.CustomTableSource  -  No filters defined
> {noformat}
> which appears to indicate that although filters are getting pushed down, the 
> final job does not use them.

[jira] [Commented] (FLINK-12399) FilterableTableSource does not use filters on job run

2019-09-03 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921546#comment-16921546
 ] 

Rong Rong commented on FLINK-12399:
---

Hi [~fhueske], would you please kindly take a look at the approach to address 
this issue? The problem has caused some trouble for us and has spawned multiple 
threads on the mailing list.
It would be nice to address this before the next release. Much appreciated.

> FilterableTableSource does not use filters on job run
> -
>
> Key: FLINK-12399
> URL: https://issues.apache.org/jira/browse/FLINK-12399
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.8.0
>Reporter: Josh Bradt
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Attachments: flink-filter-bug.tar.gz
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As discussed [on the mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Filter-push-down-not-working-for-a-custom-BatchTableSource-tp27654.html],
>  there appears to be a bug where a job that uses a custom 
> FilterableTableSource does not keep the filters that were pushed down into 
> the table source. More specifically, the table source does receive filters 
> via applyPredicate, and a new table source with those filters is returned, 
> but the final job graph appears to use the original table source, which does 
> not contain any filters.
> I attached a minimal example program to this ticket. The custom table source 
> is as follows: 
> {code:java}
> public class CustomTableSource implements BatchTableSource<Model>,
>         FilterableTableSource<Model> {
> 
>     private static final Logger LOG =
>             LoggerFactory.getLogger(CustomTableSource.class);
> 
>     private final Filter[] filters;
>     private final FilterConverter converter = new FilterConverter();
> 
>     public CustomTableSource() {
>         this(null);
>     }
> 
>     private CustomTableSource(Filter[] filters) {
>         this.filters = filters;
>     }
> 
>     @Override
>     public DataSet<Model> getDataSet(ExecutionEnvironment execEnv) {
>         if (filters == null) {
>             LOG.info(" No filters defined ");
>         } else {
>             LOG.info(" Found filters ");
>             for (Filter filter : filters) {
>                 LOG.info("FILTER: {}", filter);
>             }
>         }
>         return execEnv.fromCollection(allModels());
>     }
> 
>     @Override
>     public TableSource<Model> applyPredicate(List<Expression> predicates) {
>         LOG.info("Applying predicates");
>         List<Filter> acceptedFilters = new ArrayList<>();
>         for (final Expression predicate : predicates) {
>             converter.convert(predicate).ifPresent(acceptedFilters::add);
>         }
>         return new CustomTableSource(acceptedFilters.toArray(new Filter[0]));
>     }
> 
>     @Override
>     public boolean isFilterPushedDown() {
>         return filters != null;
>     }
> 
>     @Override
>     public TypeInformation<Model> getReturnType() {
>         return TypeInformation.of(Model.class);
>     }
> 
>     @Override
>     public TableSchema getTableSchema() {
>         return TableSchema.fromTypeInfo(getReturnType());
>     }
> 
>     private List<Model> allModels() {
>         List<Model> models = new ArrayList<>();
>         models.add(new Model(1, 2, 3, 4));
>         models.add(new Model(10, 11, 12, 13));
>         models.add(new Model(20, 21, 22, 23));
>         return models;
>     }
> }
> {code}
>  
> When run, it logs
> {noformat}
> 15:24:54,888 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,901 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,910 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,977 INFO  com.klaviyo.filterbug.CustomTableSource  -  No filters defined
> {noformat}
> which appears to indicate that although filters are getting pushed down, the 
> final job does not use them.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-13548) Support priority of the Flink YARN application

2019-08-14 Thread Rong Rong (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907465#comment-16907465
 ] 

Rong Rong commented on FLINK-13548:
---

Thanks for the feedback [~till.rohrmann]. Yes, I think so too. And since this 
is only a configuration key-value pair, it will simply get ignored by older 
versions. I've run the flink-yarn-tests module against both new and old 
versions of Hadoop YARN and they all look pretty promising. I will run some 
more tests before merging.

> Support priority of the Flink YARN application
> --
>
> Key: FLINK-13548
> URL: https://issues.apache.org/jira/browse/FLINK-13548
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: boxiu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, Flink 1.9 does not support setting the priority of a YARN 
> application on submission; the default priority of submitted jobs follows the 
> official YARN documentation.
> To support this, we can provide a ConfigOption in YarnConfigOptions. The 
> priority value is non-negative: the bigger the number, the higher the 
> priority. By default we take -1; when the priority is negative, the default 
> YARN queue priority is used.
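For illustration, the option could be declared along these lines; the key name 
and description are illustrative, not necessarily what gets merged:

{code:java}
// Sketch only: a possible ConfigOption in YarnConfigOptions.
public static final ConfigOption<Integer> APPLICATION_PRIORITY =
    ConfigOptions.key("yarn.application.priority")
        .defaultValue(-1)
        .withDescription(
            "A non-negative priority for submitting the Flink YARN application. "
                + "With the default -1 (or any negative value), the default "
                + "priority of the YARN queue is used.");
{code}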



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13548) Support priority of the Flink YARN application

2019-08-11 Thread Rong Rong (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16904697#comment-16904697
 ] 

Rong Rong commented on FLINK-13548:
---

Hi [~boswell], thanks for the contribution.
I just came to realize that the priority scheduling feature is only available 
in [YARN 2.8.x and 
up|https://hadoop.apache.org/docs/r2.8.5/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html#Setup_for_application_priority.].
Although we can build Flink with Hadoop 2.8.x+ using [specific 
commands|https://ci.apache.org/projects/flink/flink-docs-stable/flinkDev/building.html#hadoop-versions],
I was wondering whether we should implement a feature that only works in some 
versions.

CC [~till.rohrmann], who might have better insight here.

> Support priority of the Flink YARN application
> --
>
> Key: FLINK-13548
> URL: https://issues.apache.org/jira/browse/FLINK-13548
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: boxiu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, Flink 1.9 does not support setting the priority of a YARN 
> application on submission; the default priority of submitted jobs follows the 
> official YARN documentation.
> To support this, we can provide a ConfigOption in YarnConfigOptions. The 
> priority value is non-negative: the bigger the number, the higher the 
> priority. By default we take -1; when the priority is negative, the default 
> YARN queue priority is used.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13603) Flink Table ApI not working with nested Json schema starting From 1.6.x

2019-08-08 Thread Rong Rong (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903441#comment-16903441
 ] 

Rong Rong commented on FLINK-13603:
---

To answer your question, [~jacky.du0...@gmail.com]: any critical bug fix 
should be backported to older release branches (at least the last two, if I am 
not mistaken).

> Flink Table ApI not working with nested Json schema starting From 1.6.x
> ---
>
> Key: FLINK-13603
> URL: https://issues.apache.org/jira/browse/FLINK-13603
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.6.4, 1.7.2, 1.8.1
>Reporter: Yu Du
>Priority: Major
>  Labels: bug
> Attachments: FlinkTableBugCode, jsonSchema.json, jsonSchema2.json, 
> schema_mapping_error_screenshot .png
>
>
> Starting from Flink 1.6.2, some schemas stop working when they contain nested 
> objects. The failure looks like: Caused by: 
> org.apache.calcite.sql.validate.SqlValidatorException: Column 
> 'data.interaction.action_type' not found in table, 
> even though we can see that column in the Table Schema.
> The same schema and query work on 1.5.2, but not on 1.6.x, 1.7.x, or 1.8.x.
>  
> I tried to dive into the bug and found the root cause: the Calcite library 
> does not map the column name to the correct Row type.
> I checked that Flink 1.6 uses the same version of Calcite as Flink 1.5, so I 
> am not sure whether Calcite is the root cause of this issue.
> Attached are the code sample and two problematic JSON schemas; both examples 
> give a column-not-found exception.
>  
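To illustrate the failure shape, a hypothetical query against a table 
registered from one of the attached schemas; the table and environment names 
are made up for this sketch:

{code:java}
// Hypothetical repro: selecting a nested field fails validation on 1.6.x+
// even though the field shows up in the table's schema.
Table result = tableEnv.sqlQuery(
    "SELECT data.interaction.action_type FROM events");
{code}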



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13603) Flink Table ApI not working with nested Json schema starting From 1.6.x

2019-08-08 Thread Rong Rong (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903440#comment-16903440
 ] 

Rong Rong commented on FLINK-13603:
---

Based on what I saw, FLINK-12848 is labeled as an improvement, so I am not sure 
whether it can make it into any branch older than 1.9. My suggestion is to fix 
this issue here and let FLINK-12848 continue as an improvement. Does anyone 
have any suggestions on this? CC [~twalthr], the original owner of FLINK-9444.

> Flink Table ApI not working with nested Json schema starting From 1.6.x
> ---
>
> Key: FLINK-13603
> URL: https://issues.apache.org/jira/browse/FLINK-13603
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.6.4, 1.7.2, 1.8.1
>Reporter: Yu Du
>Priority: Major
>  Labels: bug
> Attachments: FlinkTableBugCode, jsonSchema.json, jsonSchema2.json, 
> schema_mapping_error_screenshot .png
>
>
> Starting from Flink 1.6.2, some schemas stop working when they contain nested 
> objects. The failure looks like: Caused by: 
> org.apache.calcite.sql.validate.SqlValidatorException: Column 
> 'data.interaction.action_type' not found in table, 
> even though we can see that column in the Table Schema.
> The same schema and query work on 1.5.2, but not on 1.6.x, 1.7.x, or 1.8.x.
>  
> I tried to dive into the bug and found the root cause: the Calcite library 
> does not map the column name to the correct Row type.
> I checked that Flink 1.6 uses the same version of Calcite as Flink 1.5, so I 
> am not sure whether Calcite is the root cause of this issue.
> Attached are the code sample and two problematic JSON schemas; both examples 
> give a column-not-found exception.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13603) Flink Table ApI not working with nested Json schema starting From 1.6.x

2019-08-08 Thread Rong Rong (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903404#comment-16903404
 ] 

Rong Rong commented on FLINK-13603:
---

Hi [~jacky.du0...@gmail.com], I wouldn't think they are necessarily the same. 
Does changing the {{hashCode()}} function resolve your issue?

I just made a local change and ran through the tests in {{flink-core}}, and it 
didn't affect any test committed with FLINK-9444, so I am assuming it is not 
necessary to change the {{hashCode()}} function in that PR. This would be a 
much quicker fix (and, I think, easier to get into the 1.9 release).

> Flink Table ApI not working with nested Json schema starting From 1.6.x
> ---
>
> Key: FLINK-13603
> URL: https://issues.apache.org/jira/browse/FLINK-13603
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.6.4, 1.7.2, 1.8.1
>Reporter: Yu Du
>Priority: Major
>  Labels: bug
> Attachments: FlinkTableBugCode, jsonSchema.json, jsonSchema2.json, 
> schema_mapping_error_screenshot .png
>
>
> Starting from Flink 1.6.2, some schemas stop working when they contain nested 
> objects. The failure looks like: Caused by: 
> org.apache.calcite.sql.validate.SqlValidatorException: Column 
> 'data.interaction.action_type' not found in table, 
> even though we can see that column in the Table Schema.
> The same schema and query work on 1.5.2, but not on 1.6.x, 1.7.x, or 1.8.x.
>  
> I tried to dive into the bug and found the root cause: the Calcite library 
> does not map the column name to the correct Row type.
> I checked that Flink 1.6 uses the same version of Calcite as Flink 1.5, so I 
> am not sure whether Calcite is the root cause of this issue.
> Attached are the code sample and two problematic JSON schemas; both examples 
> give a column-not-found exception.
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

