[jira] [Created] (FLINK-12716) Add an interactive shell for Python Table API
Dian Fu created FLINK-12716: --- Summary: Add an interactive shell for Python Table API Key: FLINK-12716 URL: https://issues.apache.org/jira/browse/FLINK-12716 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu We should add an interactive shell for the Python Table API. It will have functionality similar to the Scala Shell. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12717) Add windows support for the Python shell script
Dian Fu created FLINK-12717: --- Summary: Add windows support for the Python shell script Key: FLINK-12717 URL: https://issues.apache.org/jira/browse/FLINK-12717 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu We should add a Windows shell script for pyflink-gateway-server.sh. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12719) Add the catalog API for the Python Table API
Dian Fu created FLINK-12719: --- Summary: Add the catalog API for the Python Table API Key: FLINK-12719 URL: https://issues.apache.org/jira/browse/FLINK-12719 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Assignee: Dian Fu The new catalog API is almost ready. We should add the corresponding Python catalog API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12720) Add the Python Table API Sphinx docs
Dian Fu created FLINK-12720: --- Summary: Add the Python Table API Sphinx docs Key: FLINK-12720 URL: https://issues.apache.org/jira/browse/FLINK-12720 Project: Flink Issue Type: Sub-task Components: API / Python, Documentation Reporter: Dian Fu As the Python Table API is added, we should add the Python Table API Sphinx docs. This includes the following work: 1) Add scripts to build the Sphinx docs 2) Add a link in the main page to the generated doc -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12722) Adds Python Table API tutorial
Dian Fu created FLINK-12722: --- Summary: Adds Python Table API tutorial Key: FLINK-12722 URL: https://issues.apache.org/jira/browse/FLINK-12722 Project: Flink Issue Type: Sub-task Components: Documentation Reporter: Dian Fu Assignee: Dian Fu We should add a tutorial for the Python Table API in the docs to help beginners gain a basic understanding of how to create a simple Python Table API job. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12719) Add the Python catalog API
[ https://issues.apache.org/jira/browse/FLINK-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12719: Summary: Add the Python catalog API (was: Add the catalog API for the Python Table API) > Add the Python catalog API > -- > > Key: FLINK-12719 > URL: https://issues.apache.org/jira/browse/FLINK-12719 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > > The new catalog API is almost ready. We should add the corresponding Python > catalog API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12723) Adds a wiki page about setting up a Python Table API development environment
Dian Fu created FLINK-12723: --- Summary: Adds a wiki page about setting up a Python Table API development environment Key: FLINK-12723 URL: https://issues.apache.org/jira/browse/FLINK-12723 Project: Flink Issue Type: Sub-task Reporter: Dian Fu We should add a wiki page showing how to set up a Python Table API development environment, to help contributors who are interested in the Python Table API get started easily. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12767) Support user defined connectors/format
[ https://issues.apache.org/jira/browse/FLINK-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12767: Component/s: (was: API / Python) Table SQL / API > Support user defined connectors/format > -- > > Key: FLINK-12767 > URL: https://issues.apache.org/jira/browse/FLINK-12767 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > > Currently, only built-in connectors such as FileSystem/Kafka/ES are supported > and only built-in formats such as OldCSV/JSON/Avro/CSV are supported. We > should also provide a convenient way to use connectors/formats that are not > supported out of the box. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12757) Improves the Python word_count example to use the descriptor API
Dian Fu created FLINK-12757: --- Summary: Improves the Python word_count example to use the descriptor API Key: FLINK-12757 URL: https://issues.apache.org/jira/browse/FLINK-12757 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Assignee: Dian Fu The aim of this ticket is to improve the word_count example: 1. Use the from_elements API to create a source table 2. Use the descriptor API to register the sink -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-12541) Add deploy a Python Flink job and session cluster on Kubernetes support.
[ https://issues.apache.org/jira/browse/FLINK-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857303#comment-16857303 ] Dian Fu commented on FLINK-12541: - Hi [~sunjincheng121] Regarding Kubernetes support for Python jobs, there are two things to do: 1) Support to submit Python jobs to a Kubernetes session cluster. To support this, we need to improve the REST API to make it able to submit Python Table API jobs. 2) Support to run a specific job cluster on Kubernetes. To support this, we need to improve the job specific docker image build scripts to support Python Table API jobs. Currently, I have submitted two PRs using the same Jira ticket. Does it make sense to you to focus on part 1) in this ticket and create another ticket for part 2)? > Add deploy a Python Flink job and session cluster on Kubernetes support. > > > Key: FLINK-12541 > URL: https://issues.apache.org/jira/browse/FLINK-12541 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Runtime / REST >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: Dian Fu >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Add deploy a Python Flink job and session cluster on Kubernetes support. > We need to have the same deployment step as the Java job. Please see: > [https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12767) Support user defined connectors/format
[ https://issues.apache.org/jira/browse/FLINK-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12767: Component/s: (was: Table SQL / API) API / Python > Support user defined connectors/format > -- > > Key: FLINK-12767 > URL: https://issues.apache.org/jira/browse/FLINK-12767 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > > Currently, only built-in connectors such as FileSystem/Kafka/ES are supported > and only built-in formats such as OldCSV/JSON/Avro/CSV are supported. We > should also provide a convenient way to use connectors/formats that are not > supported out of the box. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-12541) Add deploy a Python Flink job and session cluster on Kubernetes support.
[ https://issues.apache.org/jira/browse/FLINK-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857303#comment-16857303 ] Dian Fu edited comment on FLINK-12541 at 6/6/19 5:32 AM: - Hi [~sunjincheng121] Regarding Kubernetes support for Python jobs, there are two things to do: 1) Support to submit Python jobs to a Kubernetes session cluster. To support this, we need to improve the REST API to make it able to submit Python Table API jobs. 2) Support to run a job-specific cluster on Kubernetes. To support this, we need to improve the job specific docker image build scripts to support Python Table API jobs. Currently, I have submitted two PRs using the same Jira ticket. Does it make sense to you to focus on part 1) in this ticket and create another ticket for part 2)? was (Author: dian.fu): Hi [~sunjincheng121] Regarding to Kubernetes support for Python jobs, there are two things to do: 1) Support to submit Python jobs to a Kubernetes session cluster. To support this, we need to improve the REST API to make it able to submit Python Table API jobs. 2) Support to run a specific job cluster on Kubernetes. To support this, we need to improve the job specific docker image build scripts to support Python Table API jobs. Currently, I have submitted two PRs using the same Jira ticket. Does it make sense to you to focus on part 1) in this ticket and creating another ticket for part 2)? > Add deploy a Python Flink job and session cluster on Kubernetes support. > > > Key: FLINK-12541 > URL: https://issues.apache.org/jira/browse/FLINK-12541 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Runtime / REST >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: Dian Fu >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Add deploy a Python Flink job and session cluster on Kubernetes support. > We need to have the same deployment step as the Java job. 
Please see: > [https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12767) Support user defined connectors/format
Dian Fu created FLINK-12767: --- Summary: Support user defined connectors/format Key: FLINK-12767 URL: https://issues.apache.org/jira/browse/FLINK-12767 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Assignee: Dian Fu Currently, only built-in connectors such as FileSystem/Kafka/ES are supported and only built-in formats such as OldCSV/JSON/Avro/CSV are supported. We should also provide a convenient way to use connectors/formats that are not supported out of the box. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12821) Fix the bug that fix time quantifier can not be the last element of a pattern
[ https://issues.apache.org/jira/browse/FLINK-12821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12821: Description: Currently, exception "Greedy quantifiers are not allowed as the last element of a Pattern yet. Finish your pattern with either a simple variable or reluctant quantifier." will be thrown for patterns such as "a\{2}". Actually greedy property is not meaningful for this kind of pattern. (was: Currently, exception "Greedy quantifiers are not allowed as the last element of a Pattern yet. Finish your pattern with either a simple variable or reluctant quantifier." will be thrown for pattern "a\{2}". Actually this pattern is not greedy and we should fix it.) > Fix the bug that fix time quantifier can not be the last element of a pattern > - > > Key: FLINK-12821 > URL: https://issues.apache.org/jira/browse/FLINK-12821 > Project: Flink > Issue Type: Sub-task > Components: Library / CEP, Table SQL / API >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, exception "Greedy quantifiers are not allowed as the last element > of a Pattern yet. Finish your pattern with either a simple variable or > reluctant quantifier." will be thrown for patterns such as "a\{2}". Actually > greedy property is not meaningful for this kind of pattern. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
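The updated description argues that greediness is meaningless for a fixed-times quantifier such as "a{2}". By analogy with standard regular expressions (this is plain Python `re`, not Flink CEP code), a fixed-times quantifier matches exactly N occurrences, so there is nothing for greedy vs. reluctant to choose between:

```python
import re

# A fixed-times quantifier matches exactly two occurrences; greediness
# cannot change what it matches.
assert re.fullmatch(r"a{2}", "aa") is not None
assert re.fullmatch(r"a{2}", "aaa") is None

# Contrast with an open-ended quantifier, where greedy vs. reluctant matters:
assert re.match(r"a+", "aaa").group() == "aaa"   # greedy: consumes all
assert re.match(r"a+?", "aaa").group() == "a"    # reluctant: consumes minimum
```

This is why throwing the "Greedy quantifiers are not allowed as the last element of a Pattern" exception for such patterns is a bug rather than a legitimate restriction.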
[jira] [Updated] (FLINK-12821) Fix the bug that fix time quantifier can not be the last element of a pattern
[ https://issues.apache.org/jira/browse/FLINK-12821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12821: Fix Version/s: 1.8.1 1.9.0 1.7.3 > Fix the bug that fix time quantifier can not be the last element of a pattern > - > > Key: FLINK-12821 > URL: https://issues.apache.org/jira/browse/FLINK-12821 > Project: Flink > Issue Type: Sub-task > Components: Library / CEP, Table SQL / API >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > Labels: pull-request-available > Fix For: 1.7.3, 1.9.0, 1.8.1 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, exception "Greedy quantifiers are not allowed as the last element > of a Pattern yet. Finish your pattern with either a simple variable or > reluctant quantifier." will be thrown for patterns such as "a\{2}". Actually > greedy property is not meaningful for this kind of pattern. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12821) Fix the bug that fix time quantifier can not be the last element of a pattern
[ https://issues.apache.org/jira/browse/FLINK-12821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12821: Fix Version/s: (was: 1.8.1) (was: 1.7.3) (was: 1.9.0) > Fix the bug that fix time quantifier can not be the last element of a pattern > - > > Key: FLINK-12821 > URL: https://issues.apache.org/jira/browse/FLINK-12821 > Project: Flink > Issue Type: Sub-task > Components: Library / CEP, Table SQL / API >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, exception "Greedy quantifiers are not allowed as the last element > of a Pattern yet. Finish your pattern with either a simple variable or > reluctant quantifier." will be thrown for patterns such as "a\{2}". Actually > greedy property is not meaningful for this kind of pattern. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12788) Add support to run a Python job-specific cluster on Kubernetes
[ https://issues.apache.org/jira/browse/FLINK-12788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12788: Labels: pull-request-available (was: ) > Add support to run a Python job-specific cluster on Kubernetes > -- > > Key: FLINK-12788 > URL: https://issues.apache.org/jira/browse/FLINK-12788 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Deployment / Docker >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > Labels: pull-request-available > > As discussed in FLINK-12541, we need to support to run a Python job-specific > cluster on Kubernetes. To support this, we need to improve the job specific > docker image build scripts to support Python Table API jobs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-12803) Correct the package name for python API
[ https://issues.apache.org/jira/browse/FLINK-12803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860954#comment-16860954 ] Dian Fu commented on FLINK-12803: - Good catch. +1. > Correct the package name for python API > --- > > Key: FLINK-12803 > URL: https://issues.apache.org/jira/browse/FLINK-12803 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > > Currently the package names of Flink APIs contain the language name, > such as: > * flink-java -> org.apache.flink.api.java > * flink-scala -> org.apache.flink.api.scala > So I think we should follow this API package-name pattern and correct the > current Python API package name for `flink-python`, i.e., > * flink-python -> `org.apache.flink.python` ---> > `org.apache.flink.api.python` > What do you think? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12787) Allow to specify directory in option -pyfs
Dian Fu created FLINK-12787: --- Summary: Allow to specify directory in option -pyfs Key: FLINK-12787 URL: https://issues.apache.org/jira/browse/FLINK-12787 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Assignee: Dian Fu Currently only files can be specified in the option `-pyfs`. We want to improve it to also allow specifying directories in `-pyfs`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
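A minimal sketch of the kind of expansion the improved option could perform. The helper name `expand_pyfs_entries` and the .py-only filter are hypothetical illustrations, not the actual Flink CLI code:

```python
import os

def expand_pyfs_entries(entries):
    """For each -pyfs entry: keep plain files as-is; for a directory,
    collect the .py files it contains (hypothetical sketch)."""
    files = []
    for entry in entries:
        if os.path.isdir(entry):
            for root, _dirs, names in os.walk(entry):
                files.extend(os.path.join(root, n) for n in names if n.endswith(".py"))
        else:
            files.append(entry)
    return files
```

With this, `-pyfs mylib/,job_deps.py` could be expanded into a flat list of Python files before being shipped to the cluster.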
[jira] [Created] (FLINK-12788) Add support to run a Python job-specific cluster on Kubernetes
Dian Fu created FLINK-12788: --- Summary: Add support to run a Python job-specific cluster on Kubernetes Key: FLINK-12788 URL: https://issues.apache.org/jira/browse/FLINK-12788 Project: Flink Issue Type: Sub-task Components: API / Python, Deployment / Docker Reporter: Dian Fu Assignee: Dian Fu As discussed in FLINK-12541, we need to support to run a Python job-specific cluster on Kubernetes. To support this, we need to improve the job specific docker image build scripts to support Python Table API jobs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-12541) Add deploy a Python Flink job and session cluster on Kubernetes support.
[ https://issues.apache.org/jira/browse/FLINK-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859718#comment-16859718 ] Dian Fu commented on FLINK-12541: - [~sunjincheng121] Thanks a lot for the suggestions. I have created a ticket FLINK-12788 for the part 2) changes. > Add deploy a Python Flink job and session cluster on Kubernetes support. > > > Key: FLINK-12541 > URL: https://issues.apache.org/jira/browse/FLINK-12541 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Runtime / REST >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: Dian Fu >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Add deploy a Python Flink job and session cluster on Kubernetes support. > We need to have the same deployment step as the Java job. Please see: > [https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12821) Fix the bug that fix time quantifier can not be the last element of a pattern
Dian Fu created FLINK-12821: --- Summary: Fix the bug that fix time quantifier can not be the last element of a pattern Key: FLINK-12821 URL: https://issues.apache.org/jira/browse/FLINK-12821 Project: Flink Issue Type: Sub-task Components: Library / CEP, Table SQL / API Reporter: Dian Fu Assignee: Dian Fu Currently, exception "Greedy quantifiers are not allowed as the last element of a Pattern yet. Finish your pattern with either a simple variable or reluctant quantifier." will be thrown for pattern "a\{2}". Actually this pattern is not greedy and we should fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-12910) Fix the Python catalog test issue
[ https://issues.apache.org/jira/browse/FLINK-12910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868468#comment-16868468 ] Dian Fu commented on FLINK-12910: - This issue is introduced in [https://github.com/apache/flink/pull/8786]. Will provide a fix ASAP. > Fix the Python catalog test issue > - > > Key: FLINK-12910 > URL: https://issues.apache.org/jira/browse/FLINK-12910 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > > self = testMethod=test_table_exists> > > def test_table_exists(self): > self.catalog.create_database(self.db1, self.create_db(), False) > > pyflink/table/tests/test_catalog.py:491: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > > @staticmethod > def create_db(): > gateway = get_gateway() > j_database = gateway.jvm.GenericCatalogDatabase(\{"k1": "v1"}, > CatalogTestBase.test_comment) > E TypeError: 'JavaPackage' object is not callable > > > pyflink/table/tests/test_catalog.py:78: TypeError -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12910) Fix the Python catalog test issue
Dian Fu created FLINK-12910: --- Summary: Fix the Python catalog test issue Key: FLINK-12910 URL: https://issues.apache.org/jira/browse/FLINK-12910 Project: Flink Issue Type: Bug Components: API / Python Reporter: Dian Fu Assignee: Dian Fu
{code}
self =
    def test_table_exists(self):
        self.catalog.create_database(self.db1, self.create_db(), False)
pyflink/table/tests/test_catalog.py:491:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    @staticmethod
    def create_db():
        gateway = get_gateway()
        j_database = gateway.jvm.GenericCatalogDatabase({"k1": "v1"}, CatalogTestBase.test_comment)
E   TypeError: 'JavaPackage' object is not callable
pyflink/table/tests/test_catalog.py:78: TypeError
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12897) Improve the Python Table API docs by adding more examples
Dian Fu created FLINK-12897: --- Summary: Improve the Python Table API docs by adding more examples Key: FLINK-12897 URL: https://issues.apache.org/jira/browse/FLINK-12897 Project: Flink Issue Type: Sub-task Components: API / Python, Documentation Reporter: Dian Fu As discussed in [https://github.com/apache/flink/pull/8774], we need to improve the Python Table API docs by adding more examples. Currently, a few APIs have no examples. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12920) Drop support for register_table_sink with field_names and field_types parameters
Dian Fu created FLINK-12920: --- Summary: Drop support for register_table_sink with field_names and field_types parameters Key: FLINK-12920 URL: https://issues.apache.org/jira/browse/FLINK-12920 Project: Flink Issue Type: Bug Components: API / Python Reporter: Dian Fu Assignee: Dian Fu The following registerTableSink API in TableEnvironment is deprecated: {code:java} @Deprecated void registerTableSink(String name, String[] fieldNames, TypeInformation<?>[] fieldTypes, TableSink<?> tableSink); {code} We can drop support for it in the Python Table API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-12931) lint-python.sh cannot find flake8
[ https://issues.apache.org/jira/browse/FLINK-12931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu reassigned FLINK-12931: --- Assignee: Dian Fu (was: sunjincheng) > lint-python.sh cannot find flake8 > - > > Key: FLINK-12931 > URL: https://issues.apache.org/jira/browse/FLINK-12931 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.9.0 >Reporter: Bowen Li >Assignee: Dian Fu >Priority: Major > Fix For: 1.9.0 > > > Hi guys, > I tried to run tests for flink-python with {{./dev/lint-python.sh}} by > following README. But it reported it couldn't find flake8, error as > {code:java} > ./dev/lint-python.sh: line 490: > /.../flink/flink-python/dev/.conda/bin/flake8: No such file or directory > {code} > I've tried {{./dev/lint-python.sh -f}}, also didn't work. > I suspect the reason may be that I already have an anaconda3 installed and it > conflicts with the miniconda installed by flink-python somehow. I'm not fully > sure about that. > If that's the reason, I think we need to try to resolve the conflict because > anaconda is a pretty common package that developers install and use. We > shouldn't require devs to uninstall their existing conda environment in order > to develop flink-python and run its tests. It's better if flink-python can > have a well isolated environment on machines. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12962) Allows pyflink to be pip installed
Dian Fu created FLINK-12962: --- Summary: Allows pyflink to be pip installed Key: FLINK-12962 URL: https://issues.apache.org/jira/browse/FLINK-12962 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu The aim of this JIRA is to support building a pip-installable package. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-12585) Align Stream/BatchTableEnvironment with JAVA Table API
[ https://issues.apache.org/jira/browse/FLINK-12585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845494#comment-16845494 ] Dian Fu commented on FLINK-12585: - Agree. Having a Python StreamExecutionEnvironment/ExecutionEnvironment is necessary as there are a lot of important APIs for a job in the Java StreamExecutionEnvironment/ExecutionEnvironment such as setStateBackend, setRestartStrategy, etc. > Align Stream/BatchTableEnvironment with JAVA Table API > -- > > Key: FLINK-12585 > URL: https://issues.apache.org/jira/browse/FLINK-12585 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Affects Versions: 1.9.0 >Reporter: sunjincheng >Priority: Major > > Initially we wanted to align with the > [FLIP-32|https://cwiki.apache.org/confluence/display/FLINK/FLIP-32%3A+Restructure+flink-table+for+future+contributions] > plan and unify the TableEnvironment, such as: > {code:java} > TableConfig config = TableConfig.builder() > .asStreamingExecution() > // example of providing configuration that was in > StreamExecutionEnvironment before > .watermarkInterval(100) > .build(); > TableEnvironment tEnv = TableEnvironment.create(config);{code} > So the current Python Table API is as follows: > {code:java} > self.t_config = > TableConfig.Builder().as_streaming_execution().set_parallelism(1).build() > self.t_env = TableEnvironment.create(self.t_config){code} > But since the Java API has not made this improvement yet, and the end date of > release 1.9 is approaching, it's better to align the > `Stream/BatchTableEnvironment` with the JAVA Table API for now. We should follow > the current Java style, for example: > {code:java} > val env = StreamExecutionEnvironment.getExecutionEnvironment > env.setStateBackend(getStateBackend) > val tEnv = StreamTableEnvironment.create(env){code} > What do you think? [~dian.fu] [~WeiZhong] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12609) Align the data types of Python with Java
[ https://issues.apache.org/jira/browse/FLINK-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12609: Description: As discussed in https://github.com/apache/flink/pull/8420#issuecomment-495444623, currently, there are some data types defined in Java not supported in Python such as TIMESTAMP_WITH_TIME_ZONE, TIMESTAMP_WITH_LOCAL_TIME_ZONE, etc. We should support them in Python once these types have been fully supported in Java. was: Currently, there are some data types defined in Java not supported in Python such as TIMESTAMP_WITH_TIME_ZONE, TIMESTAMP_WITH_LOCAL_TIME_ZONE, etc. We should support them in Python once these types have been fully supported in Java. > Align the data types of Python with Java > > > Key: FLINK-12609 > URL: https://issues.apache.org/jira/browse/FLINK-12609 > Project: Flink > Issue Type: Sub-task >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > > As discussed in > https://github.com/apache/flink/pull/8420#issuecomment-495444623, currently, > there are some data types defined in Java not supported in Python such as > TIMESTAMP_WITH_TIME_ZONE, TIMESTAMP_WITH_LOCAL_TIME_ZONE, etc. We should > support them in Python once these types have been fully supported in Java. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12609) Align the data types of Python with Java
Dian Fu created FLINK-12609: --- Summary: Align the data types of Python with Java Key: FLINK-12609 URL: https://issues.apache.org/jira/browse/FLINK-12609 Project: Flink Issue Type: Sub-task Reporter: Dian Fu Assignee: Dian Fu Currently, there are some data types defined in Java not supported in Python such as TIMESTAMP_WITH_TIME_ZONE, TIMESTAMP_WITH_LOCAL_TIME_ZONE, etc. We should support them in Python once these types have been fully supported in Java. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12609) Align the Python data types with Java
[ https://issues.apache.org/jira/browse/FLINK-12609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12609: Summary: Align the Python data types with Java (was: Align the data types of Python with Java) > Align the Python data types with Java > - > > Key: FLINK-12609 > URL: https://issues.apache.org/jira/browse/FLINK-12609 > Project: Flink > Issue Type: Sub-task >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > > As discussed in > https://github.com/apache/flink/pull/8420#issuecomment-495444623, currently, > there are some data types defined in Java not supported in Python such as > TIMESTAMP_WITH_TIME_ZONE, TIMESTAMP_WITH_LOCAL_TIME_ZONE, etc. We should > support them in Python once these types have been fully supported in Java. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-12456) Wrong CharType conversion in type_utils.py
[ https://issues.apache.org/jira/browse/FLINK-12456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847539#comment-16847539 ] Dian Fu commented on FLINK-12456: - [~lzljs3620320] Thanks a lot for reporting this issue. Good catch! This issue should have been fixed as part of the work on FLINK-12408. Could you help confirm whether FLINK-12408 has solved this issue? If so, I guess we can close this ticket. > Wrong CharType conversion in type_utils.py > -- > > Key: FLINK-12456 > URL: https://issues.apache.org/jira/browse/FLINK-12456 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Jingsong Lee >Priority: Major > > In types.py, CharType is defined as SQL CHAR, but SQL CHAR is a Java String > instead of a Java Character. > In type_utils.py, > org.apache.flink.api.common.typeinfo.Types.CHAR maps to DataTypes.CHAR, which is > wrong. Types.CHAR is Java Character. > I suggest considering removing CharType support first. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11052) Add Bounded(Group Window) FlatAggregate operator to batch Table API
[ https://issues.apache.org/jira/browse/FLINK-11052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837117#comment-16837117 ] Dian Fu commented on FLINK-11052: - I'm afraid I have no time to work on this issue these days. Feel free to take it. > Add Bounded(Group Window) FlatAggregate operator to batch Table API > --- > > Key: FLINK-11052 > URL: https://issues.apache.org/jira/browse/FLINK-11052 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: sunjincheng >Assignee: Dian Fu >Priority: Major > > Add FlatAggregate operator to *batch* group window Table API as described in > [FLIP-29|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=97552739]. > The usage: > {code:java} > tab.window(Tumble/Session/Slide... as 'w) > .groupBy('w, 'k1, 'k2) > .flatAggregate(tableAggregate('a)) > .select('w.rowtime, 'k1, 'k2, 'col1, 'col2) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12455) Move the packaging of pyflink to flink-dist
[ https://issues.apache.org/jira/browse/FLINK-12455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12455: Summary: Move the packaging of pyflink to flink-dist (was: Move the package of pyflink to flink-dist) > Move the packaging of pyflink to flink-dist > --- > > Key: FLINK-12455 > URL: https://issues.apache.org/jira/browse/FLINK-12455 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > > Currently, there is a pom.xml under module flink-python which is responsible > for the packaging of pyflink. The packaging logic should be moved to flink-dist > and then we can remove the pom.xml under flink-python and make flink-python a > pure Python module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12455) Move the package of pyflink to flink-dist
Dian Fu created FLINK-12455: --- Summary: Move the package of pyflink to flink-dist Key: FLINK-12455 URL: https://issues.apache.org/jira/browse/FLINK-12455 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Assignee: Dian Fu Currently, there is a pom.xml under the flink-python module which is responsible for the packaging of pyflink. The packaging logic should be moved to flink-dist so that we can remove the pom.xml under flink-python and make flink-python a pure Python module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12409) Adds from_elements in TableEnvironment
Dian Fu created FLINK-12409: --- Summary: Adds from_elements in TableEnvironment Key: FLINK-12409 URL: https://issues.apache.org/jira/browse/FLINK-12409 Project: Flink Issue Type: Sub-task Reporter: Dian Fu Assignee: Dian Fu This is a convenient method to create a table from a collection of elements. It works as follows: 1) Serializes the python objects to a local file 2) Loads the file in Java and deserializes the data to Java objects -- This message was sent by Atlassian JIRA (v7.6.3#76005)
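The two steps above can be sketched in pure Python (a minimal sketch assuming pickle as the serialization format; both helpers are hypothetical, and PyFlink's actual wire format and the Java-side deserialization differ):

```python
import pickle
import tempfile


def serialize_elements_to_file(elements):
    """Step 1 (hypothetical helper): write a collection of Python objects
    to a local temp file and return its path for the JVM side to load."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as f:
        pickle.dump(list(elements), f)
        return f.name


def load_elements(path):
    """Stand-in for step 2, which the real implementation performs in Java:
    read the file back and reconstruct the elements."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

A round trip through the two helpers recovers the original elements, which is the contract `from_elements` relies on.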
[jira] [Created] (FLINK-12408) Support to define all kinds of types in Python API
Dian Fu created FLINK-12408: --- Summary: Support to define all kinds of types in Python API Key: FLINK-12408 URL: https://issues.apache.org/jira/browse/FLINK-12408 Project: Flink Issue Type: Sub-task Reporter: Dian Fu Assignee: Dian Fu The aim of this ticket is to: 1) Allow users to define all kinds of types in the Python API besides the primitive types which are already supported, such as map, array, row, user-defined types, etc. 2) Add basic utilities which can be used to infer types from data, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
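The type-inference utilities mentioned in 2) might look roughly like this (an illustrative sketch only; PyFlink's actual DataTypes API and inference rules may differ):

```python
def infer_sql_type(value):
    """Infer a SQL-style type name from a Python value (hypothetical sketch)."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "VARCHAR"
    if isinstance(value, tuple):  # treat tuples as ROW with one field per element
        return "ROW<%s>" % ", ".join(infer_sql_type(v) for v in value)
    if isinstance(value, list):
        elem = infer_sql_type(value[0]) if value else "VARCHAR"
        return "ARRAY<%s>" % elem
    if isinstance(value, dict):
        k, v = next(iter(value.items()))
        return "MAP<%s, %s>" % (infer_sql_type(k), infer_sql_type(v))
    raise TypeError("unsupported type: %r" % type(value))
```

For example, a row of sample data such as `(1, "a", [2.0])` would be inferred as `ROW<BIGINT, VARCHAR, ARRAY<DOUBLE>>` under these rules.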
[jira] [Assigned] (FLINK-10976) Add Aggregate operator to Table API
[ https://issues.apache.org/jira/browse/FLINK-10976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu reassigned FLINK-10976: --- Assignee: (was: Dian Fu) > Add Aggregate operator to Table API > --- > > Key: FLINK-10976 > URL: https://issues.apache.org/jira/browse/FLINK-10976 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: sunjincheng >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Add Aggregate operator to Table API as described in [Google > doc|https://docs.google.com/document/d/1tnpxg31EQz2-MEzSotwFzqatsB4rNLz0I-l_vPa5H4Q/edit#heading=h.q23rny2iglsr]. > The usage: > {code:java} > val res = tab > .groupBy('a) // leave out groupBy-clause to define global aggregates > .agg(fun: AggregateFunction) // output has columns 'a, 'b, 'c > .select('a, 'c) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-13085) unable to run python test locally
[ https://issues.apache.org/jira/browse/FLINK-13085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-13085. --- Resolution: Cannot Reproduce [~phoenixjiangnan] As explained above, I think this is not a bug. Just close this ticket. Pls feel free to reopen it if you found this problem still exists. Thanks a lot. > unable to run python test locally > - > > Key: FLINK-13085 > URL: https://issues.apache.org/jira/browse/FLINK-13085 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Bowen Li >Assignee: sunjincheng >Priority: Major > Fix For: 1.9.0 > > > Ran ./dev/lint-python.sh and got: > {code:java} > === FAILURES > === > __ ExecutionConfigTests.test_equals_and_hash > ___ > self = testMethod=test_equals_and_hash> > def test_equals_and_hash(self): > config1 = > ExecutionEnvironment.get_execution_environment().get_config() > config2 = > ExecutionEnvironment.get_execution_environment().get_config() > self.assertEqual(config1, config2) > self.assertEqual(hash(config1), hash(config2)) > config1.set_parallelism(12) > self.assertNotEqual(config1, config2) > > self.assertNotEqual(hash(config1), hash(config2)) > E AssertionError: -1960065877 == -1960065877 > pyflink/common/tests/test_execution_config.py:293: AssertionError > __ ExecutionEnvironmentTests.test_get_execution_plan > ___ > self = > testMethod=test_get_execution_plan> > def test_get_execution_plan(self): > tmp_dir = tempfile.gettempdir() > source_path = os.path.join(tmp_dir + '/streaming.csv') > tmp_csv = os.path.join(tmp_dir + '/streaming2.csv') > field_names = ["a", "b", "c"] > field_types = [DataTypes.INT(), DataTypes.STRING(), > DataTypes.STRING()] > t_env = BatchTableEnvironment.create(self.env) > csv_source = CsvTableSource(source_path, field_names, field_types) > t_env.register_table_source("Orders", csv_source) > t_env.register_table_sink( > "Results", > CsvTableSink(field_names, field_types, tmp_csv)) > > t_env.scan("Orders").insert_into("Results") 
> pyflink/dataset/tests/test_execution_environment.py:111: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > pyflink/table/table.py:583: in insert_into > self._j_table.insertInto(table_path, j_table_path) > .tox/py27/lib/python2.7/site-packages/py4j/java_gateway.py:1286: in __call__ > answer, self.gateway_client, self.target_id, self.name) > pyflink/util/exceptions.py:139: in deco > return f(*a, **kw) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > answer = 'xro290' > gateway_client = > target_id = 'o288', name = 'insertInto' > def get_return_value(answer, gateway_client, target_id=None, name=None): > """Converts an answer received from the Java gateway into a Python > object. > For example, string representation of integers are converted to Python > integer, string representation of objects are converted to JavaObject > instances, etc. > :param answer: the string returned by the Java gateway > :param gateway_client: the gateway client used to communicate with > the Java > Gateway. Only necessary if the answer is a reference (e.g., > object, > list, map) > :param target_id: the name of the object from which the answer comes > from > (e.g., *object1* in `object1.hello()`). Optional. > :param name: the name of the member from which the answer comes from > (e.g., *hello* in `object1.hello()`). Optional. > """ > if is_error(answer)[0]: > if len(answer) > 1: > type = answer[1] > value = OUTPUT_CONVERTER[type](answer[2:], gateway_client) > if answer[1] == REFERENCE_TYPE: > raise Py4JJavaError( > "An error occurred while calling {0}{1}{2}.\n". > > format(target_id, ".", name), value) > E Py4JJavaError: An error occurred while calling > o288.insertInto. 
> E : java.lang.NullPointerException > E at > org.apache.flink.api.common.io.FileOutputFormat.setWriteMode(FileOutputFormat.java:146) > E at > org.apache.flink.api.java.DataSet.writeAsText(DataSet.java:1510) > E at > org.apache.flink.table.sinks.CsvTableSink.emitDataSet(CsvTableSink.scala:76) > E at >
[jira] [Closed] (FLINK-12456) Wrong CharType conversion in type_utils.py
[ https://issues.apache.org/jira/browse/FLINK-12456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-12456. --- Resolution: Duplicate [~lzljs3620320] This issue should have been resolved in FLINK-12408. Just closing this ticket. Feel free to reopen it if you find this problem still exists. Thanks a lot. > Wrong CharType conversion in type_utils.py > -- > > Key: FLINK-12456 > URL: https://issues.apache.org/jira/browse/FLINK-12456 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Jingsong Lee >Priority: Major > > In types.py, CharType is defined as SQL CHAR, but SQL CHAR is a Java String > rather than a Java Character. > In type_utils.py, > org.apache.flink.api.common.typeinfo.Types.CHAR is mapped to DataTypes.CHAR, which is > wrong: Types.CHAR is a Java Character. > I suggest considering removing CHAR support first. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-13011) Release the PyFlink into PyPI
[ https://issues.apache.org/jira/browse/FLINK-13011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874684#comment-16874684 ] Dian Fu commented on FLINK-13011: - [~sunjincheng121] [~Zentol] I have contacted the owner of [https://pypi.org/project/pyflink/] and have obtained ownership of this project. > Release the PyFlink into PyPI > - > > Key: FLINK-13011 > URL: https://issues.apache.org/jira/browse/FLINK-13011 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Build System >Affects Versions: 1.9.0 >Reporter: sunjincheng >Priority: Major > > FLINK-12962 adds the ability to build a PyFlink distribution package, but we > have not yet released PyFlink to PyPI. The goal of this JIRA is to publish the > PyFlink distribution package built by FLINK-12962 to PyPI. > https://pypi.org/ > https://packaging.python.org/tutorials/packaging-projects/ > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-13085) unable to run python test locally
[ https://issues.apache.org/jira/browse/FLINK-13085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878255#comment-16878255 ] Dian Fu commented on FLINK-13085: - [~phoenixjiangnan] The exception is because the interfaces of Java CSVTableSource has changed and so you need to rebuild the Java package with "mvn clean package -DskipTests" before executing "./dev/lint-python.sh". > unable to run python test locally > - > > Key: FLINK-13085 > URL: https://issues.apache.org/jira/browse/FLINK-13085 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Bowen Li >Assignee: sunjincheng >Priority: Major > Fix For: 1.9.0 > > > Ran ./dev/lint-python.sh and got: > {code:java} > === FAILURES > === > __ ExecutionConfigTests.test_equals_and_hash > ___ > self = testMethod=test_equals_and_hash> > def test_equals_and_hash(self): > config1 = > ExecutionEnvironment.get_execution_environment().get_config() > config2 = > ExecutionEnvironment.get_execution_environment().get_config() > self.assertEqual(config1, config2) > self.assertEqual(hash(config1), hash(config2)) > config1.set_parallelism(12) > self.assertNotEqual(config1, config2) > > self.assertNotEqual(hash(config1), hash(config2)) > E AssertionError: -1960065877 == -1960065877 > pyflink/common/tests/test_execution_config.py:293: AssertionError > __ ExecutionEnvironmentTests.test_get_execution_plan > ___ > self = > testMethod=test_get_execution_plan> > def test_get_execution_plan(self): > tmp_dir = tempfile.gettempdir() > source_path = os.path.join(tmp_dir + '/streaming.csv') > tmp_csv = os.path.join(tmp_dir + '/streaming2.csv') > field_names = ["a", "b", "c"] > field_types = [DataTypes.INT(), DataTypes.STRING(), > DataTypes.STRING()] > t_env = BatchTableEnvironment.create(self.env) > csv_source = CsvTableSource(source_path, field_names, field_types) > t_env.register_table_source("Orders", csv_source) > t_env.register_table_sink( > "Results", > CsvTableSink(field_names, field_types, 
tmp_csv)) > > t_env.scan("Orders").insert_into("Results") > pyflink/dataset/tests/test_execution_environment.py:111: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > pyflink/table/table.py:583: in insert_into > self._j_table.insertInto(table_path, j_table_path) > .tox/py27/lib/python2.7/site-packages/py4j/java_gateway.py:1286: in __call__ > answer, self.gateway_client, self.target_id, self.name) > pyflink/util/exceptions.py:139: in deco > return f(*a, **kw) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > answer = 'xro290' > gateway_client = > target_id = 'o288', name = 'insertInto' > def get_return_value(answer, gateway_client, target_id=None, name=None): > """Converts an answer received from the Java gateway into a Python > object. > For example, string representation of integers are converted to Python > integer, string representation of objects are converted to JavaObject > instances, etc. > :param answer: the string returned by the Java gateway > :param gateway_client: the gateway client used to communicate with > the Java > Gateway. Only necessary if the answer is a reference (e.g., > object, > list, map) > :param target_id: the name of the object from which the answer comes > from > (e.g., *object1* in `object1.hello()`). Optional. > :param name: the name of the member from which the answer comes from > (e.g., *hello* in `object1.hello()`). Optional. > """ > if is_error(answer)[0]: > if len(answer) > 1: > type = answer[1] > value = OUTPUT_CONVERTER[type](answer[2:], gateway_client) > if answer[1] == REFERENCE_TYPE: > raise Py4JJavaError( > "An error occurred while calling {0}{1}{2}.\n". > > format(target_id, ".", name), value) > E Py4JJavaError: An error occurred while calling > o288.insertInto. 
> E : java.lang.NullPointerException > E at > org.apache.flink.api.common.io.FileOutputFormat.setWriteMode(FileOutputFormat.java:146) > E at > org.apache.flink.api.java.DataSet.writeAsText(DataSet.java:1510) > E at > org.apache.flink.table.sinks.CsvTableSink.emitDataSet(CsvTableSink.scala:76) > E at >
[jira] [Assigned] (FLINK-12326) Add a basic test framework, just like the existing Java TableAPI, abstract some TestBase.
[ https://issues.apache.org/jira/browse/FLINK-12326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu reassigned FLINK-12326: --- Assignee: Dian Fu > Add a basic test framework, just like the existing Java TableAPI, abstract > some TestBase. > - > > Key: FLINK-12326 > URL: https://issues.apache.org/jira/browse/FLINK-12326 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: Dian Fu >Priority: Major > > Add a basic test framework, just like the existing Java/Scala TableAPI, > abstract some TestBase. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-12308) Support python language in Flink Table API
[ https://issues.apache.org/jira/browse/FLINK-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825698#comment-16825698 ] Dian Fu commented on FLINK-12308: - Hi [~sunjincheng121] that's very kind of you. It is great to participate in this feature and looking forward to collaborating with you. > Support python language in Flink Table API > -- > > Key: FLINK-12308 > URL: https://issues.apache.org/jira/browse/FLINK-12308 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > > At the Flink API level, we have DataStreamAPI/DataSetAPI/TableAPI, the > Table API will become the first-class citizen. Table API is declarative, and > can be automatically optimized, which is mentioned in the Flink mid-term > roadmap by Stephan. So, first considering supporting Python at the Table > level to cater to the current large number of analytics users. And Flink's > goal for Python Table API as follows: > * Users can write Flink Table API job in Python, and should mirror Java / > Scala Table API > * Users can submit Python Table API job in the following ways: > ** Submit a job with python script, integrate with `flink run` > ** Submit a job with python script by REST service > ** Submit a job in an interactive way, similar `scala-shell` > ** Local debug in IDE. > * Users can write custom functions(UDF, UDTF, UDAF) > * Pandas functions can be used in Flink Python Table API > A more detailed description can be found in > [FLIP-38.|https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API] > For the API level, we make the following plan: > * The short-term: > We may initially go with a simple approach to map the Python Table API to > the Java Table API via Py4J. > * The long-term: > We may need to create a Python API that follows the same structure as > Flink's Table API that produces the language-independent DAG. 
(As Stephan > already mentioned on the [mailing > thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html#a28096]) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
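The short-term approach described above (thin Python wrappers that forward each call to the Java Table API through Py4J) can be sketched as follows; the Java side is stubbed out here, whereas real PyFlink talks to an actual py4j JavaGateway:

```python
class Table:
    """Thin Python wrapper that forwards Table API calls to a Java Table
    object (here a stub standing in for a py4j proxy)."""

    def __init__(self, j_table):
        self._j_table = j_table

    def select(self, fields):
        # Each Python method maps 1:1 onto the Java Table API method.
        return Table(self._j_table.select(fields))

    def group_by(self, fields):
        return Table(self._j_table.groupBy(fields))


class FakeJavaTable:
    """Stub for the py4j proxy; records the calls it receives."""

    def __init__(self, calls=()):
        self.calls = list(calls)

    def select(self, fields):
        return FakeJavaTable(self.calls + ["select(%s)" % fields])

    def groupBy(self, fields):
        return FakeJavaTable(self.calls + ["groupBy(%s)" % fields])
```

Because the wrapper only delegates, plan construction and optimization stay entirely on the Java side, which is what makes this approach attractive as a first step.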
[jira] [Commented] (FLINK-12308) Support python language in Flink Table API
[ https://issues.apache.org/jira/browse/FLINK-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824834#comment-16824834 ] Dian Fu commented on FLINK-12308: - Thanks [~sunjincheng121] for creating this ticket. A big +1 on this feature. > Support python language in Flink Table API > -- > > Key: FLINK-12308 > URL: https://issues.apache.org/jira/browse/FLINK-12308 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API >Affects Versions: 1.9.0 >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > > At the Flink API level, we have DataStreamAPI/DataSetAPI/TableAPI, the > Table API will become the first-class citizen. Table API is declarative, and > can be automatically optimized, which is mentioned in the Flink mid-term > roadmap by Stephan. So, first considering supporting Python at the Table > level to cater to the current large number of analytics users. And Flink's > goal for Python Table API as follows: > * Users can write Flink Table API job in Python, and should mirror Java / > Scala Table API > * Users can submit Python Table API job in the following ways: > ** Submit a job with python script, integrate with `flink run` > ** Submit a job with python script by REST service > ** Submit a job in an interactive way, similar `scala-shell` > ** Local debug in IDE. > * Users can write custom functions(UDF, UDTF, UDAF) > * Pandas functions can be used in Flink Python Table API > A more detailed description can be found in FLIP-38(Will be done soon). > For the API level, we make the following plan: > * The short-term: > We may initially go with a simple approach to map the Python Table API to > the Java Table API via Py4J. > * The long-term: > We may need to create a Python API that follows the same structure as > Flink's Table API that produces the language-independent DAG. 
(As Stephan > already mentioned on the [mailing > thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html#a28096]) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12991) Correct the implementation of Catalog.get_table_factory
[ https://issues.apache.org/jira/browse/FLINK-12991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12991: Description: The following method is added in catalog.py in FLINK-11480: {code} def get_table_factory(self): """ Get an optional TableFactory instance that's responsible for generating source/sink for tables stored in this catalog. :return: An optional TableFactory instance. """ return self._j_catalog.getTableFactory() {code} There is some problem with the implementation as it returns a Java TableFactory and this is not friendly for Python users. We should correct the implementation. Before doing that, we need to make sure the following thing: Is this method designed to be used by users or will only be used internally? I take a quick look at the code and it seems to me that this method will only be used internally. Maybe [~xuefuz] and [~phoenixjiangnan] can provide more information about this. If this method is designed to be used by users directly, we need to provide a Python TableFactory wrapper and makes sure this method is usable for Python users. If this method is designed to be only used internally, then we need to remove it from the Python catalog. For the API completeness test, we can add this method to *excluded_methods* of *CatalogAPICompletenessTests* to make the tests passed. was: The following method is added in catalog.py in FLINK-11480: {code:java} def get_table_factory(self): """ Get an optional TableFactory instance that's responsible for generating source/sink for tables stored in this catalog. :return: An optional TableFactory instance. """ return self._j_catalog.getTableFactory() {code} There is some problem with the implementation as it returns a Java TableFactory and this is not friendly for Python users. We should correct the implementation. Before doing that, we need to make sure the following thing: Is this method designed to be used by users or will only be used internally? 
I take a quick look at the code and it seems to me that this method will only be used internally. Maybe [~xuefuz] and [~phoenixjiangnan] can provide more information about this. If this method is designed to be used by users directly, we need to provide a Python TableFactory wrapper and makes sure this method is usable for Python users. If this method is designed to be only used internally, then we need to remove it from the Python catalog. For the API completeness test, we can add this method to *excluded_methods* of *CatalogAPICompletenessTests* to make the tests passed. > Correct the implementation of Catalog.get_table_factory > --- > > Key: FLINK-12991 > URL: https://issues.apache.org/jira/browse/FLINK-12991 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Dian Fu >Priority: Major > > The following method is added in catalog.py in FLINK-11480: > {code} > def get_table_factory(self): > """ > Get an optional TableFactory instance that's responsible for generating > source/sink for tables stored in this catalog. > > :return: An optional TableFactory instance. > """ > return self._j_catalog.getTableFactory() > {code} > There is some problem with the implementation as it returns a Java > TableFactory and this is not friendly for Python users. We should correct the > implementation. > Before doing that, we need to make sure the following thing: > Is this method designed to be used by users or will only be used internally? > I take a quick look at the code and it seems to me that this method will only > be used internally. Maybe [~xuefuz] and [~phoenixjiangnan] can provide more > information about this. > If this method is designed to be used by users directly, we need to provide a > Python TableFactory wrapper and makes sure this method is usable for Python > users. If this method is designed to be only used internally, then we need to > remove it from the Python catalog. 
For the API completeness test, we can add > this method to *excluded_methods* of *CatalogAPICompletenessTests* to make > the tests passed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-12991) Correct the implementation of Catalog.get_table_factory
[ https://issues.apache.org/jira/browse/FLINK-12991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-12991: Description: The following method is added in catalog.py in FLINK-11480: {code:java} def get_table_factory(self): """ Get an optional TableFactory instance that's responsible for generating source/sink for tables stored in this catalog. :return: An optional TableFactory instance. """ return self._j_catalog.getTableFactory() {code} There is some problem with the implementation as it returns a Java TableFactory and this is not friendly for Python users. We should correct the implementation. Before doing that, we need to make sure the following thing: Is this method designed to be used by users or will only be used internally? I take a quick look at the code and it seems to me that this method will only be used internally. Maybe [~xuefuz] and [~phoenixjiangnan] can provide more information about this. If this method is designed to be used by users directly, we need to provide a Python TableFactory wrapper and makes sure this method is usable for Python users. If this method is designed to be only used internally, then we need to remove it from the Python catalog. For the API completeness test, we can add this method to *excluded_methods* of *CatalogAPICompletenessTests* to make the tests passed. was: The following method is added in catalog.py in FLINK-11480: {code:java} def get_table_factory(self): """ Get an optional TableFactory instance that's responsible for generating source/sink for tables stored in this catalog. :return: An optional TableFactory instance. """ return self._j_catalog.getTableFactory() {code} There is some problem with the implementation as it returns a Java TableFactory and this is not friendly for Python users. We should correct the implementation. Before doing that, we need to make sure the following thing: Is this method designed to be used by users or it will only be used internally? 
I take a quick look at the code and it seems to me that this method will only be used internally. Maybe [~xuefuz] and [~phoenixjiangnan] can provide more information about this? If this method will be used by users directly, we need to provide a Python TableFactory wrapper and makes sure the this method is usable for Python users. If this method is designed to be only used internally, then we need to remove it from the Python catalog. For the API completeness test, we can add this method to excluded_methods of CatalogAPICompletenessTests to make the tests passed. > Correct the implementation of Catalog.get_table_factory > --- > > Key: FLINK-12991 > URL: https://issues.apache.org/jira/browse/FLINK-12991 > Project: Flink > Issue Type: Bug > Components: API / Python >Reporter: Dian Fu >Priority: Major > > The following method is added in catalog.py in FLINK-11480: > {code:java} > def get_table_factory(self): > """ > Get an optional TableFactory instance that's responsible for > generating source/sink for tables stored in this catalog. :return: An > optional TableFactory instance. > """ > return self._j_catalog.getTableFactory() > {code} > There is some problem with the implementation as it returns a Java > TableFactory and this is not friendly for Python users. We should correct the > implementation. > Before doing that, we need to make sure the following thing: > Is this method designed to be used by users or will only be used internally? > I take a quick look at the code and it seems to me that this method will only > be used internally. Maybe [~xuefuz] and [~phoenixjiangnan] can provide more > information about this. > If this method is designed to be used by users directly, we need to provide a > Python TableFactory wrapper and makes sure this method is usable for Python > users. If this method is designed to be only used internally, then we need to > remove it from the Python catalog. 
For the API completeness test, we can add > this method to *excluded_methods* of *CatalogAPICompletenessTests* to make > the tests passed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-12990) Date type doesn't consider the local TimeZone
Dian Fu created FLINK-12990: --- Summary: Date type doesn't consider the local TimeZone Key: FLINK-12990 URL: https://issues.apache.org/jira/browse/FLINK-12990 Project: Flink Issue Type: Bug Components: API / Python Reporter: Dian Fu Assignee: Dian Fu Currently, the Python DateType is converted to an `int` which indicates the days passed since 1970-01-01, and the Java side then creates a Java Date by calling `new Date(days * 86400)`. As we know, the Date constructor expects milliseconds since 1970-01-01 00:00:00 GMT, so we should convert `days * 86400` to GMT milliseconds. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
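The unit mismatch can be sketched in plain Python (a simplified sketch; as the title suggests, the actual fix must additionally account for the JVM's local time zone when constructing the Java Date):

```python
SECONDS_PER_DAY = 86400
MILLIS_PER_SECOND = 1000


def days_to_epoch_millis(days):
    """Convert days-since-1970-01-01 to epoch milliseconds (GMT).

    The buggy code passed `days * 86400` (seconds) to the Date constructor,
    which expects milliseconds since 1970-01-01 00:00:00 GMT, so the value
    was off by a factor of 1000 even before any time-zone adjustment.
    """
    return days * SECONDS_PER_DAY * MILLIS_PER_SECOND
```

For example, day 1 (1970-01-02) corresponds to 86,400,000 milliseconds, not the 86,400 the original code produced.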
[jira] [Created] (FLINK-12991) Correct the implementation of Catalog.get_table_factory
Dian Fu created FLINK-12991: --- Summary: Correct the implementation of Catalog.get_table_factory Key: FLINK-12991 URL: https://issues.apache.org/jira/browse/FLINK-12991 Project: Flink Issue Type: Bug Components: API / Python Reporter: Dian Fu The following method is added in catalog.py in FLINK-11480: {code:java} def get_table_factory(self): """ Get an optional TableFactory instance that's responsible for generating source/sink for tables stored in this catalog. :return: An optional TableFactory instance. """ return self._j_catalog.getTableFactory() {code} There is a problem with the implementation: it returns a Java TableFactory, which is not friendly for Python users. We should correct the implementation. Before doing that, we need to make sure of the following: Is this method designed to be used by users, or will it only be used internally? I took a quick look at the code and it seems to me that this method will only be used internally. Maybe [~xuefuz] and [~phoenixjiangnan] can provide more information about this. If this method will be used by users directly, we need to provide a Python TableFactory wrapper and make sure this method is usable for Python users. If this method is designed to be only used internally, then we need to remove it from the Python catalog. For the API completeness test, we can add this method to excluded_methods of CatalogAPICompletenessTests to make the tests pass. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-13743) Port PythonTableUtils to flink-python module
Dian Fu created FLINK-13743: --- Summary: Port PythonTableUtils to flink-python module Key: FLINK-13743 URL: https://issues.apache.org/jira/browse/FLINK-13743 Project: Flink Issue Type: Task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 Currently *PythonTableUtils* is located in the *flink-table-planner* module; however, it is shared by both the Flink planner and the Blink planner. It makes more sense to move it to the *flink-python* module. This also means that we should port it from Scala to Java. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Comment Edited] (FLINK-13488) flink-python fails to build on Travis due to PackagesNotFoundError
[ https://issues.apache.org/jira/browse/FLINK-13488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916370#comment-16916370 ] Dian Fu edited comment on FLINK-13488 at 8/27/19 5:01 AM: -- Hi [~ykt836], I guess this fix is actually contained in [1.9.0|https://github.com/apache/flink/commits/release-1.9.0?after=9c32ed989c0178a2bf3e059e897927c451188700+190]. Pls correct me if I missed something:) Do you mean the PR [https://github.com/apache/flink/pull/9431] ? I guess it refers to a wrong Jira number and was linked to this Jira by mistake. [~gjy] was (Author: dian.fu): Hi [~ykt836], I guess this fix is actually contained in [1.9.0|https://github.com/apache/flink/commits/release-1.9.0?after=9c32ed989c0178a2bf3e059e897927c451188700+190]. Pls correct me if I missed something:) > flink-python fails to build on Travis due to PackagesNotFoundError > -- > > Key: FLINK-13488 > URL: https://issues.apache.org/jira/browse/FLINK-13488 > Project: Flink > Issue Type: Bug > Components: API / Python, Test Infrastructure >Reporter: Tzu-Li (Gordon) Tai >Assignee: sunjincheng >Priority: Blocker > Labels: pull-request-available > Fix For: 1.10.0, 1.9.1 > > Time Spent: 1h > Remaining Estimate: 0h > > https://api.travis-ci.org/v3/job/564925115/log.txt > {code} > install conda ... [SUCCESS] > install miniconda... [SUCCESS] > installing python environment... > installing python2.7... > install python2.7... [SUCCESS] > installing python3.3... > PackagesNotFoundError: The following packages are not available from current > channels: > - python=3.3 > Current channels: > - https://repo.anaconda.com/pkgs/main/linux-64 > - https://repo.anaconda.com/pkgs/main/noarch > - https://repo.anaconda.com/pkgs/r/linux-64 > - https://repo.anaconda.com/pkgs/r/noarch > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (FLINK-13488) flink-python fails to build on Travis due to PackagesNotFoundError
[ https://issues.apache.org/jira/browse/FLINK-13488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916370#comment-16916370 ] Dian Fu commented on FLINK-13488: - Hi [~ykt836], I guess this fix is actually contained in [1.9.0|https://github.com/apache/flink/commits/release-1.9.0?after=9c32ed989c0178a2bf3e059e897927c451188700+190]. Pls correct me if I missed something:) > flink-python fails to build on Travis due to PackagesNotFoundError > -- > > Key: FLINK-13488 > URL: https://issues.apache.org/jira/browse/FLINK-13488 > Project: Flink > Issue Type: Bug > Components: API / Python, Test Infrastructure >Reporter: Tzu-Li (Gordon) Tai >Assignee: sunjincheng >Priority: Blocker > Labels: pull-request-available > Fix For: 1.10.0, 1.9.1 > > Time Spent: 1h > Remaining Estimate: 0h > > https://api.travis-ci.org/v3/job/564925115/log.txt > {code} > install conda ... [SUCCESS] > install miniconda... [SUCCESS] > installing python environment... > installing python2.7... > install python2.7... [SUCCESS] > installing python3.3... > PackagesNotFoundError: The following packages are not available from current > channels: > - python=3.3 > Current channels: > - https://repo.anaconda.com/pkgs/main/linux-64 > - https://repo.anaconda.com/pkgs/main/noarch > - https://repo.anaconda.com/pkgs/r/linux-64 > - https://repo.anaconda.com/pkgs/r/noarch > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (FLINK-6935) Integration of SQL and CEP
[ https://issues.apache.org/jira/browse/FLINK-6935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917548#comment-16917548 ] Dian Fu commented on FLINK-6935: Hi [~libenchao] Most of the features mentioned there are available in the blink table planner. The missing features are mainly the ones that use extended SQL grammar beyond the standard SQL. > Integration of SQL and CEP > -- > > Key: FLINK-6935 > URL: https://issues.apache.org/jira/browse/FLINK-6935 > Project: Flink > Issue Type: New Feature > Components: Library / CEP, Table SQL / API >Reporter: Jark Wu >Assignee: Dian Fu >Priority: Major > > Flink's CEP library is a great library for complex event processing, more and > more customers are expressing their interests in it. But it also has some > limitations that users usually have to write a lot of code even for a very > simple pattern match use case as it currently only supports the Java API. > CEP DSLs and SQLs strongly resemble each other. CEP's additional features > compared to SQL boil down to pattern detection. So It will be awesome to > consolidate CEP and SQL. It makes SQL more powerful to support more usage > scenario. And it gives users the ability to easily and quickly to build CEP > applications. > The FLIP can be found here: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-20%3A+Integration+of+SQL+and+CEP > This is an umbrella issue for the FLIP. We should wait for Calcite 1.13 to > start this work. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (FLINK-14066) bug of building pyflink in master and 1.9.0 version
[ https://issues.apache.org/jira/browse/FLINK-14066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928369#comment-16928369 ] Dian Fu commented on FLINK-14066: - Hi [~coldmoon777], currently PyFlink is still not supported on Windows. So I'm afraid that there may also be other issues besides this one, e.g. the corresponding Windows scripts for pyflink-gateway-server.sh are needed to run on Windows (there is a ticket FLINK-12717 for this). > bug of building pyflink in master and 1.9.0 version > --- > > Key: FLINK-14066 > URL: https://issues.apache.org/jira/browse/FLINK-14066 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.9.0, 1.10.0 > Environment: windows 10 enterprise x64 > powershell x64 > flink source master and 1.9.0 version > jdk-8u202 > maven-3.2.5 >Reporter: Xu Yang >Priority: Blocker > Labels: beginner, build > Attachments: setup.py > > Original Estimate: 1h > Remaining Estimate: 1h > > During we build pyflink... > After we have built flink from flink source code, a folder named "target" is > generated. > Then, following the document description, "cd flink-python; python3 setup.py > sdist bdist_wheel", error happens. > Root cause: in the setup.py file, line 75, "FLINK_HOME = > os.path.abspath("../build-target")", the program can't found folder > "build-target", however, the building of flink generated a folder named > "target". So error happens in this way... > > The right way: > in ../flink-python/setup.py line 75, modify code as following: > FLINK_HOME = os.path.abspath("../target") -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (FLINK-14066) pyflink building failure in master and 1.9.0 version
[ https://issues.apache.org/jira/browse/FLINK-14066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928507#comment-16928507 ] Dian Fu commented on FLINK-14066: - I mean that for Windows support, we need to consider both building and running pyflink on Windows. If you just want to build pyflink on Windows (and do not need to run it on Windows), then we can first fix the build issue. If you also have requirements to run pyflink on Windows, then we also need to add the corresponding Windows scripts for scripts such as pyflink-gateway-server.sh. > pyflink building failure in master and 1.9.0 version > > > Key: FLINK-14066 > URL: https://issues.apache.org/jira/browse/FLINK-14066 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.9.0, 1.10.0 > Environment: windows 10 enterprise x64(mentioned as build > environment, not development environment.) > powershell x64 > flink source master and 1.9.0 version > jdk-8u202 > maven-3.2.5 >Reporter: Xu Yang >Priority: Blocker > Labels: beginner, build > Attachments: setup.py > > Original Estimate: 1h > Remaining Estimate: 1h > > ATTENTION: This is a issue about building pyflink, not development. > During we build pyflink... > After we have built flink from flink source code, a folder named "target" is > generated. > Then, following the document description, "cd flink-python; python3 setup.py > sdist bdist_wheel", error happens. > Root cause: in the setup.py file, line 75, "FLINK_HOME = > os.path.abspath("../build-target")", the program can't found folder > "build-target", however, the building of flink generated a folder named > "target". So error happens in this way... > > The right way: > in ../flink-python/setup.py line 75, modify code as following: > FLINK_HOME = os.path.abspath("../target") -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14013) Support Flink Python User-Defined Stateless Function for Table
Dian Fu created FLINK-14013: --- Summary: Support Flink Python User-Defined Stateless Function for Table Key: FLINK-14013 URL: https://issues.apache.org/jira/browse/FLINK-14013 Project: Flink Issue Type: New Feature Components: API / Python, Table SQL / API Affects Versions: 1.10.0 Reporter: Dian Fu Fix For: 1.10.0 The Python Table API has been supported in release 1.9.0. See [FLIP-38|https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API] and FLINK-12308 for details. However, currently Python user-defined functions are still not supported. In this FLIP, we want to support stateless Python user-defined functions in the Python Table API. A more detailed description can be found in [FLIP-58|https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table]. The discussion can be found in the [mailing thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-User-Defined-Function-for-Table-API-td31673.html]. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14015) Introduce PythonScalarFunctionOperator as a standalone StreamOperator for Python ScalarFunction execution
Dian Fu created FLINK-14015: --- Summary: Introduce PythonScalarFunctionOperator as a standalone StreamOperator for Python ScalarFunction execution Key: FLINK-14015 URL: https://issues.apache.org/jira/browse/FLINK-14015 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 PythonScalarFunctionOperator is a standalone StreamOperator and it doesn’t need to know how the Python ScalarFunctions are executed, which is the responsibility of PythonScalarFunctionRunner: # It is a StreamOperator which employs PythonScalarFunctionRunner for Python ScalarFunction execution # It sends input elements to PythonScalarFunctionRunner, fetches the execution results, constructs the result rows and sends them to the downstream operator # It should handle checkpoints and watermarks properly -- This message was sent by Atlassian Jira (v8.3.2#803003)
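The division of responsibilities above can be sketched as a toy model (plain Python stand-ins for the Java classes; checkpoint and watermark handling is omitted):

```python
class EchoRunner:
    """Stand-in for PythonScalarFunctionRunner: executes the UDF and
    hands back the result; the operator never sees how."""
    def __init__(self, udf):
        self.udf = udf

    def process(self, element):
        return self.udf(element)


class ScalarFunctionOperator:
    """Toy StreamOperator: forwards each input element to the runner,
    fetches the execution result and sends it downstream."""
    def __init__(self, runner, downstream):
        self.runner = runner
        self.downstream = downstream  # list standing in for the next operator

    def process_element(self, element):
        result = self.runner.process(element)
        self.downstream.append(result)
```

The point of the split is that the operator stays a plain StreamOperator; swapping how UDFs execute only means swapping the runner.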
[jira] [Updated] (FLINK-14013) Support Flink Python User-Defined Stateless Function for Table
[ https://issues.apache.org/jira/browse/FLINK-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-14013: Description: The Python Table API has been supported in release 1.9.0. See the [FLIP-38|https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API] and FLINK-12308 for details. However, currently Python user-defined functions are still not supported. In this FLIP, we want to support stateless Python user-defined functions in Python Table API. More detailed description could be found in [FLIP-58|https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table]. The discussion could be found in [mailing thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-User-Defined-Function-for-Table-API-td31673.html]. was: The Python Table API has been supported in release 1.9.0. See the [FLIP-38|https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API] and FLINK-12308 for details. However, currently Python user-defined functions are still not supported. In this FLIP, we want to support stateless Python user-defined functions in Python Table API. More detailed description can be found in [FLIP-58|https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table]. The discussion can be found in [mailing thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-User-Defined-Function-for-Table-API-td31673.html]. > Support Flink Python User-Defined Stateless Function for Table > -- > > Key: FLINK-14013 > URL: https://issues.apache.org/jira/browse/FLINK-14013 > Project: Flink > Issue Type: New Feature > Components: API / Python, Table SQL / API >Affects Versions: 1.10.0 >Reporter: Dian Fu >Priority: Major > Fix For: 1.10.0 > > > The Python Table API has been supported in release 1.9.0. 
See the > [FLIP-38|https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API] > and FLINK-12308 for details. However, currently Python user-defined > functions are still not supported. In this FLIP, we want to support stateless > Python user-defined functions in Python Table API. > More detailed description could be found in > [FLIP-58|https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table]. > The discussion could be found in [mailing > thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-User-Defined-Function-for-Table-API-td31673.html]. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (FLINK-14013) Support Flink Python User-Defined Stateless Function for Table
[ https://issues.apache.org/jira/browse/FLINK-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-14013: Description: The Python Table API has been supported in release 1.9.0. See the [FLIP-38|https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API] and FLINK-12308 for details. However, currently Python user-defined functions are still not supported. In this FLIP, we want to support stateless Python user-defined functions in Python Table API. More detailed description can be found in [FLIP-58|https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table]. The discussion can be found in [mailing thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-User-Defined-Function-for-Table-API-td31673.html]. was: The Python Table API has been supported in release 1.9.0. See the [FLIP-38|[https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API]] and FLINK-12308 for details. However, currently Python user-defined functions are still not supported. In this FLIP, we want to support stateless Python user-defined functions in Python Table API. More detailed description can be found in [FLIP-58|[https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table]]. The discussion can be found in [mailing thread|[http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-User-Defined-Function-for-Table-API-td31673.html]]. > Support Flink Python User-Defined Stateless Function for Table > -- > > Key: FLINK-14013 > URL: https://issues.apache.org/jira/browse/FLINK-14013 > Project: Flink > Issue Type: New Feature > Components: API / Python, Table SQL / API >Affects Versions: 1.10.0 >Reporter: Dian Fu >Priority: Major > Fix For: 1.10.0 > > > The Python Table API has been supported in release 1.9.0. 
See the > [FLIP-38|https://cwiki.apache.org/confluence/display/FLINK/FLIP-38%3A+Python+Table+API] > and FLINK-12308 for details. However, currently Python user-defined > functions are still not supported. In this FLIP, we want to support stateless > Python user-defined functions in Python Table API. > More detailed description can be found in > [FLIP-58|https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table]. > The discussion can be found in [mailing > thread|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-Python-User-Defined-Function-for-Table-API-td31673.html]. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14014) Introduce PythonScalarFunctionRunner to handle the communication with Python worker for Python ScalarFunction execution
Dian Fu created FLINK-14014: --- Summary: Introduce PythonScalarFunctionRunner to handle the communication with Python worker for Python ScalarFunction execution Key: FLINK-14014 URL: https://issues.apache.org/jira/browse/FLINK-14014 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 PythonScalarFunctionRunner is responsible for Python ScalarFunction execution and it handles only the Python ScalarFunction execution and nothing else. So its logic should be very simple: forwarding an input element to the Python worker and fetching the execution results back: # Internally, it uses Apache Beam’s portability framework for Python UDF execution and this is transparent to the caller of PythonScalarFunctionRunner # By default, each runner will start up a separate Python worker # The Python worker can run in a docker container, a separate process or even a non-managed external service # It has the ability to execute multiple Python ScalarFunctions # It also supports chained Python ScalarFunctions -- This message was sent by Atlassian Jira (v8.3.2#803003)
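The last point above, chained Python ScalarFunctions, can be illustrated with a toy runner; the class name and in-process composition are hypothetical simplifications (the real runner talks to a Beam SDK harness rather than calling functions directly):

```python
class ChainedScalarFunctionRunner:
    """Toy model of chained ScalarFunction execution: each element flows
    through the chained functions in order and only the final result is
    fetched back, so one round trip to the Python worker covers the
    whole chain instead of one round trip per function."""
    def __init__(self, *functions):
        self.functions = functions

    def process(self, element):
        for fn in self.functions:
            element = fn(element)
        return element
```
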
[jira] [Created] (FLINK-14016) Introduce RelNodes FlinkLogicalPythonScalarFunctionExec and DataStreamPythonScalarFunctionExec which are containers for Python PythonScalarFunctions
Dian Fu created FLINK-14016: --- Summary: Introduce RelNodes FlinkLogicalPythonScalarFunctionExec and DataStreamPythonScalarFunctionExec which are containers for Python PythonScalarFunctions Key: FLINK-14016 URL: https://issues.apache.org/jira/browse/FLINK-14016 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 Dedicated RelNodes such as FlinkLogicalPythonScalarFunctionExec and DataStreamPythonScalarFunctionExec should be introduced for Python ScalarFunction execution. These nodes exist as containers for Python ScalarFunctions which can be executed in a batch, and then we can employ PythonScalarFunctionOperator for Python ScalarFunction execution. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-13999) Correct the documentation of MATCH_RECOGNIZE
Dian Fu created FLINK-13999: --- Summary: Correct the documentation of MATCH_RECOGNIZE Key: FLINK-13999 URL: https://issues.apache.org/jira/browse/FLINK-13999 Project: Flink Issue Type: Bug Components: Documentation Reporter: Dian Fu Regarding the following [example|https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/match_recognize.html#aggregations] in the doc: {code:java} SELECT * FROM Ticker MATCH_RECOGNIZE ( PARTITION BY symbol ORDER BY rowtime MEASURES FIRST(A.rowtime) AS start_tstamp, LAST(A.rowtime) AS end_tstamp, AVG(A.price) AS avgPrice ONE ROW PER MATCH AFTER MATCH SKIP TO FIRST B PATTERN (A+ B) DEFINE A AS AVG(A.price) < 15 ) MR; {code} Given the inputs shown in the doc, it should be: {code:java} symbol start_tstamp end_tstamp avgPrice ======  ==================  ==================  ======== ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5{code} instead of: {code:java} symbol start_tstamp end_tstamp avgPrice ======  ==================  ==================  ======== ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5 ACME 01-APR-11 10:00:04 01-APR-11 10:00:09 13.5 {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (FLINK-13999) Correct the documentation of MATCH_RECOGNIZE
[ https://issues.apache.org/jira/browse/FLINK-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-13999: Description: Regarding to the following [example|#aggregations]] in the doc: {code:java} SELECT * FROM Ticker MATCH_RECOGNIZE ( PARTITION BY symbol ORDER BY rowtime MEASURES FIRST(A.rowtime) AS start_tstamp, LAST(A.rowtime) AS end_tstamp, AVG(A.price) AS avgPrice ONE ROW PER MATCH AFTER MATCH SKIP TO FIRST B PATTERN (A+ B) DEFINE A AS AVG(A.price) < 15 ) MR; {code} Given the inputs shown in the doc, it should be: {code:java} symbol start_tstamp end_tstamp avgPrice = == == ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5{code} instead of: {code:java} symbol start_tstamp end_tstamp avgPrice = == == ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5 ACME 01-APR-11 10:00:04 01-APR-11 10:00:09 13.5 {code} was: Regarding to the following [example|[https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/match_recognize.html#aggregations]] in the doc: {code:java} SELECT * FROM Ticker MATCH_RECOGNIZE ( PARTITION BY symbol ORDER BY rowtime MEASURES FIRST(A.rowtime) AS start_tstamp, LAST(A.rowtime) AS end_tstamp, AVG(A.price) AS avgPrice ONE ROW PER MATCH AFTER MATCH SKIP TO FIRST B PATTERN (A+ B) DEFINE A AS AVG(A.price) < 15 ) MR; {code} Given the inputs shown in the doc, it should be: {code:java} symbol start_tstamp end_tstamp avgPrice = == == ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5{code} instead of: {code:java} symbol start_tstamp end_tstamp avgPrice = == == ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5 ACME 01-APR-11 10:00:04 01-APR-11 10:00:09 13.5 {code} > Correct the documentation of MATCH_RECOGNIZE > > > Key: FLINK-13999 > URL: https://issues.apache.org/jira/browse/FLINK-13999 > Project: Flink > Issue Type: Bug > Components: Documentation >Reporter: Dian Fu >Priority: Major > > Regarding to the following [example|#aggregations]] in the doc: > {code:java} > SELECT * > FROM Ticker > MATCH_RECOGNIZE 
( > PARTITION BY symbol > ORDER BY rowtime > MEASURES > FIRST(A.rowtime) AS start_tstamp, > LAST(A.rowtime) AS end_tstamp, > AVG(A.price) AS avgPrice > ONE ROW PER MATCH > AFTER MATCH SKIP TO FIRST B > PATTERN (A+ B) > DEFINE > A AS AVG(A.price) < 15 > ) MR; > {code} > Given the inputs shown in the doc, it should be: > {code:java} > symbol start_tstamp end_tstamp avgPrice > = == == > ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5{code} > instead of: > {code:java} > symbol start_tstamp end_tstamp avgPrice > = == == > ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5 > ACME 01-APR-11 10:00:04 01-APR-11 10:00:09 13.5 > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (FLINK-13999) Correct the documentation of MATCH_RECOGNIZE
[ https://issues.apache.org/jira/browse/FLINK-13999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924731#comment-16924731 ] Dian Fu commented on FLINK-13999: - This issue is reported in the user list: [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Wrong-result-of-MATCH-RECOGNIZE-clause-td29820.html] > Correct the documentation of MATCH_RECOGNIZE > > > Key: FLINK-13999 > URL: https://issues.apache.org/jira/browse/FLINK-13999 > Project: Flink > Issue Type: Bug > Components: Documentation >Reporter: Dian Fu >Priority: Major > > Regarding to the following > [example|[https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/match_recognize.html#aggregations]] > in the doc: > {code:java} > SELECT * > FROM Ticker > MATCH_RECOGNIZE ( > PARTITION BY symbol > ORDER BY rowtime > MEASURES > FIRST(A.rowtime) AS start_tstamp, > LAST(A.rowtime) AS end_tstamp, > AVG(A.price) AS avgPrice > ONE ROW PER MATCH > AFTER MATCH SKIP TO FIRST B > PATTERN (A+ B) > DEFINE > A AS AVG(A.price) < 15 > ) MR; > {code} > Given the inputs shown in the doc, it should be: > {code:java} > symbol start_tstamp end_tstamp avgPrice > = == == > ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5{code} > instead of: > {code:java} > symbol start_tstamp end_tstamp avgPrice > = == == > ACME 01-APR-11 10:00:00 01-APR-11 10:00:03 14.5 > ACME 01-APR-11 10:00:04 01-APR-11 10:00:09 13.5 > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14027) Add documentation for Python user-defined functions
Dian Fu created FLINK-14027: --- Summary: Add documentation for Python user-defined functions Key: FLINK-14027 URL: https://issues.apache.org/jira/browse/FLINK-14027 Project: Flink Issue Type: Sub-task Components: API / Python, Documentation Reporter: Dian Fu Fix For: 1.10.0 We should add documentation about how to use Python user-defined functions. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14022) Add validation check for places where Python ScalarFunction cannot be used
Dian Fu created FLINK-14022: --- Summary: Add validation check for places where Python ScalarFunction cannot be used Key: FLINK-14022 URL: https://issues.apache.org/jira/browse/FLINK-14022 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 Currently, there are places where Python ScalarFunctions cannot be used, for example: # Python UDFs cannot be used in MatchRecognize # Python UDFs cannot be used in a Join condition which takes columns from both the left table and the right table as inputs We should add validation checks for places where they are not supported. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14023) Support accessing job parameters in Python user-defined functions
Dian Fu created FLINK-14023: --- Summary: Support accessing job parameters in Python user-defined functions Key: FLINK-14023 URL: https://issues.apache.org/jira/browse/FLINK-14023 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 Currently, it’s possible to access job parameters in the Java user-defined functions. It could be used to define the behavior according to job parameters. It should also be supported for Python user-defined functions. -- This message was sent by Atlassian Jira (v8.3.2#803003)
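A toy sketch of what accessing job parameters from a Python UDF could look like; the `FunctionContext`/`get_job_parameter` names mirror the Java side but are hypothetical here, not a committed Python API:

```python
class FunctionContext:
    """Toy stand-in for the runtime context handed to a UDF in open()."""
    def __init__(self, job_parameters):
        self._job_parameters = job_parameters

    def get_job_parameter(self, key, default_value):
        return self._job_parameters.get(key, default_value)


class Multiplier:
    """UDF whose behavior is configured via a job parameter."""
    def open(self, context):
        # read the configuration once, when the function is opened
        self.factor = int(context.get_job_parameter("factor", "1"))

    def eval(self, x):
        return x * self.factor
```
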
[jira] [Created] (FLINK-14017) Support to start up Python worker in process mode
Dian Fu created FLINK-14017: --- Summary: Support to start up Python worker in process mode Key: FLINK-14017 URL: https://issues.apache.org/jira/browse/FLINK-14017 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 We employ Apache Beam's portability framework for the Python UDF execution. However, there is only a golang implementation of the boot script to start up the SDK harness in Beam. It’s used by both the Python SDK harness and the Go SDK harness. This is not a problem for Beam. However, it’s indeed a problem for Flink, as it means that the whole stack of Beam’s Go SDK harness would become a dependency if we used the golang implementation of the boot script. We want to avoid this by adding a Python boot script. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14018) Add Python building blocks to make sure the basic functionality of Python ScalarFunction could work
Dian Fu created FLINK-14018: --- Summary: Add Python building blocks to make sure the basic functionality of Python ScalarFunction could work Key: FLINK-14018 URL: https://issues.apache.org/jira/browse/FLINK-14018 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 We need to add a few Python building blocks such as ScalarFunctionOperation, BigIntCoder, VarcharCoder, etc. for Python ScalarFunction execution. ScalarFunctionOperation is a subclass of Operation in Beam and BigIntCoder, VarcharCoder, etc. are subclasses of Coder in Beam. These classes will be registered into Beam’s portability framework to make sure they take effect. This PR makes sure that a basic end-to-end Python UDF can be executed. -- This message was sent by Atlassian Jira (v8.3.2#803003)
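As an illustration of the Coder building blocks mentioned above, a toy BigIntCoder; the 8-byte big-endian encoding is chosen for the sketch and is not Beam's actual wire format:

```python
import struct

class BigIntCoder:
    """Toy coder for BIGINT values: a fixed 8-byte big-endian signed
    integer. Real coders are registered with Beam's portability
    framework so the Java operator and the Python harness agree on
    the byte-level encoding of every column."""
    def encode(self, value):
        return struct.pack(">q", value)

    def decode(self, data):
        return struct.unpack(">q", data)[0]
```
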
[jira] [Created] (FLINK-14024) Support user-defined metrics in Python user-defined functions
Dian Fu created FLINK-14024: --- Summary: Support user-defined metrics in Python user-defined functions Key: FLINK-14024 URL: https://issues.apache.org/jira/browse/FLINK-14024 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 We should allow users to define metrics in Python user-defined functions. Beam’s portability framework provides a framework for metrics reporting. We can make use of it. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14026) Manage the resource of Python worker properly
Dian Fu created FLINK-14026: --- Summary: Manage the resource of Python worker properly Key: FLINK-14026 URL: https://issues.apache.org/jira/browse/FLINK-14026 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 For a Flink Table API & SQL job, if it uses Python user-defined functions, the Java operator will launch a separate Python process for Python user-defined function execution. We should make sure that the resources used by the Python process are managed by Flink’s resource management framework. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (FLINK-14020) Use Apache Arrow as the serializer for data transmission between Java operator and Python harness
[ https://issues.apache.org/jira/browse/FLINK-14020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-14020: Description: Apache Arrow is "a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware". It has been widely used in many notable projects, such as Spark, Parquet, Pandas, etc. We should firstly benchmark whether it could improve the performance a lot for non-vectorized Python UDFs. If we see significant performance improvements, it would be great to use it for the Java/Python communication. Otherwise, record by record serializer will be used. was:Apache Arrow is "a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware". It has been widely used in many notable projects, such as Spark, Parquet, Pandas, etc. We could make use of Arrow as the data serializer between Java operator and Python harness. > User Apache Arrow as the serializer for data transmission between Java > operator and Python harness > -- > > Key: FLINK-14020 > URL: https://issues.apache.org/jira/browse/FLINK-14020 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > Fix For: 1.10.0 > > > Apache Arrow is "a cross-language development platform for in-memory data. It > specifies a standardized language-independent columnar memory format for flat > and hierarchical data, organized for efficient analytic operations on modern > hardware". It has been widely used in many notable projects, such as Spark, > Parquet, Pandas, etc. > We should firstly benchmark whether it could improve the performance a lot > for non-vectorized Python UDFs. 
If we see significant performance > improvements, it would be great to use it for the Java/Python communication. > Otherwise, record by record serializer will be used. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14025) Support to run the Python worker in docker mode
Dian Fu created FLINK-14025: --- Summary: Support to run the Python worker in docker mode Key: FLINK-14025 URL: https://issues.apache.org/jira/browse/FLINK-14025 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 The Python worker runs in “Process” mode by default. Docker mode should be supported as well. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14019) Python environment and dependency management
Dian Fu created FLINK-14019: --- Summary: Python environment and dependency management Key: FLINK-14019 URL: https://issues.apache.org/jira/browse/FLINK-14019 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 A Python user-defined function may depend on third-party dependencies. We should provide a proper way to handle them: # Provide a way to let users specify the dependencies # Provide a way to let users specify the Python interpreter used -- This message was sent by Atlassian Jira (v8.3.2#803003)
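The two requirements listed above could surface as a small per-job configuration object; the sketch below uses hypothetical names (`set_python_requirements`, `set_python_executable`), not a committed Flink API:

```python
class PythonDependencyConfig:
    """Toy container for the per-job Python environment settings that
    would be shipped to every Python worker."""
    def __init__(self):
        self.requirements = []
        self.python_executable = "python"

    def set_python_requirements(self, *packages):
        # e.g. pip-style specifiers to install before the worker starts
        self.requirements.extend(packages)

    def set_python_executable(self, path):
        # which interpreter the worker process should be started with
        self.python_executable = path
```
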
[jira] [Created] (FLINK-14020) Use Apache Arrow as the serializer for data transmission between Java operator and Python harness
Dian Fu created FLINK-14020: --- Summary: Use Apache Arrow as the serializer for data transmission between Java operator and Python harness Key: FLINK-14020 URL: https://issues.apache.org/jira/browse/FLINK-14020 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 Apache Arrow is "a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware". It has been widely used in many notable projects, such as Spark, Parquet, Pandas, etc. We could make use of Arrow as the data serializer between the Java operator and the Python harness. -- This message was sent by Atlassian Jira (v8.3.2#803003)
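The benefit a columnar format is expected to bring here is batching: one contiguous buffer per batch of records instead of one serialization call per record. A toy comparison, using plain `struct` packing as a stand-in for Arrow's actual format:

```python
import struct

def encode_rows(values):
    """Record-by-record: one serialization call and one message per value."""
    return [struct.pack(">q", v) for v in values]

def encode_column(values):
    """Columnar, Arrow-style in spirit: the whole BIGINT column is packed
    into a single contiguous buffer in one call."""
    return struct.pack(">%dq" % len(values), *values)
```

The bytes end up identical here; the difference the ticket wants to benchmark is the per-record call and framing overhead, which the columnar path amortizes over the batch.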
[jira] [Created] (FLINK-14021) Add rules to push down the Python ScalarFunctions contained in the join condition of Correlate node
Dian Fu created FLINK-14021: --- Summary: Add rules to push down the Python ScalarFunctions contained in the join condition of Correlate node Key: FLINK-14021 URL: https://issues.apache.org/jira/browse/FLINK-14021 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 The Python ScalarFunctions contained in the join condition of Correlate node should be extracted to make sure the TableFunction works well. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (FLINK-14028) Support logging aggregation in Python user-defined functions
Dian Fu created FLINK-14028: --- Summary: Support logging aggregation in Python user-defined functions Key: FLINK-14028 URL: https://issues.apache.org/jira/browse/FLINK-14028 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 Beam's portability framework provides the ability to collect logs from the Python workers to the operator. We should make use of this functionality to collect the logging of Python user-defined functions and output it into the logging file of the Java operator process. Then users can easily access the logging generated by the Python user-defined functions. -- This message was sent by Atlassian Jira (v8.3.2#803003)
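On the Python side a UDF would just use the standard `logging` module; a handler installed by the framework forwards the records. A sketch with an in-memory handler standing in for Beam's log channel (the logger name and setup are illustrative):

```python
import io
import logging

# Handler installed by the framework; the StringIO stands in for the
# channel that ships log records back to the Java operator process.
buffer = io.StringIO()
logger = logging.getLogger("python_udf")
logger.addHandler(logging.StreamHandler(buffer))
logger.setLevel(logging.INFO)

def add_one(x):
    """A UDF that logs; the record would end up in the operator's log file."""
    logger.info("processing %s", x)
    return x + 1
```
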
[jira] [Commented] (FLINK-13488) flink-python fails to build on Travis due to PackagesNotFoundError
[ https://issues.apache.org/jira/browse/FLINK-13488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895917#comment-16895917 ] Dian Fu commented on FLINK-13488: - [~sunjincheng121] +1 to drop the support for Python 3.3 and 3.4. Python 3.3 reached end of life on Sep 27, 2017 and Python 3.4 reached end of life on Mar 18, 2019, and they are not officially supported any more. > flink-python fails to build on Travis due to PackagesNotFoundError > -- > > Key: FLINK-13488 > URL: https://issues.apache.org/jira/browse/FLINK-13488 > Project: Flink > Issue Type: Bug > Components: API / Python, Test Infrastructure >Reporter: Tzu-Li (Gordon) Tai >Assignee: sunjincheng >Priority: Blocker > Fix For: 1.9.0 > > > https://api.travis-ci.org/v3/job/564925115/log.txt > {code} > install conda ... [SUCCESS] > install miniconda... [SUCCESS] > installing python environment... > installing python2.7... > install python2.7... [SUCCESS] > installing python3.3... > PackagesNotFoundError: The following packages are not available from current > channels: > - python=3.3 > Current channels: > - https://repo.anaconda.com/pkgs/main/linux-64 > - https://repo.anaconda.com/pkgs/main/noarch > - https://repo.anaconda.com/pkgs/r/linux-64 > - https://repo.anaconda.com/pkgs/r/noarch > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
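Dropping 3.3/3.4 support can be enforced with a small guard in setup.py; the function name and the exact supported set below are illustrative, not the actual flink-python check:

```python
import sys

def check_python_version(version_info=None):
    """Reject end-of-life Python 3 interpreters (3.3/3.4) while still
    accepting Python 2.7, in line with the discussion above."""
    v = version_info or sys.version_info
    if v[0] == 3 and v[:2] < (3, 5):
        raise RuntimeError(
            "Python 3.3/3.4 have reached end of life; use Python 3.5+")
```
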
[jira] [Commented] (FLINK-13488) flink-python fails to build on Travis due to PackagesNotFoundError
[ https://issues.apache.org/jira/browse/FLINK-13488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896098#comment-16896098 ] Dian Fu commented on FLINK-13488: - Makes sense to me. > flink-python fails to build on Travis due to PackagesNotFoundError > -- > > Key: FLINK-13488 > URL: https://issues.apache.org/jira/browse/FLINK-13488 > Project: Flink > Issue Type: Bug > Components: API / Python, Test Infrastructure >Reporter: Tzu-Li (Gordon) Tai >Assignee: sunjincheng >Priority: Blocker > Labels: pull-request-available > Fix For: 1.9.0 > > Time Spent: 10m > Remaining Estimate: 0h > > https://api.travis-ci.org/v3/job/564925115/log.txt > {code} > install conda ... [SUCCESS] > install miniconda... [SUCCESS] > installing python environment... > installing python2.7... > install python2.7... [SUCCESS] > installing python3.3... > PackagesNotFoundError: The following packages are not available from current > channels: > - python=3.3 > Current channels: > - https://repo.anaconda.com/pkgs/main/linux-64 > - https://repo.anaconda.com/pkgs/main/noarch > - https://repo.anaconda.com/pkgs/r/linux-64 > - https://repo.anaconda.com/pkgs/r/noarch > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Assigned] (FLINK-13308) flink-python releases 2 jars
[ https://issues.apache.org/jira/browse/FLINK-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu reassigned FLINK-13308: --- Assignee: Dian Fu > flink-python releases 2 jars > > > Key: FLINK-13308 > URL: https://issues.apache.org/jira/browse/FLINK-13308 > Project: Flink > Issue Type: Bug > Components: API / Python, Build System >Affects Versions: 1.9.0 >Reporter: Chesnay Schepler >Assignee: Dian Fu >Priority: Blocker > Fix For: 1.9.0 > > > {{flink-python}} uses a classifier to differentiate itself from the old > python API. Turns out this doesn't work since it still tries to release a > normal unshaded flink-python jar. > We should drop the classifier, and either stick to flink-python or rename it > as proposed in FLINK-12776. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (FLINK-13308) flink-python releases 2 jars
[ https://issues.apache.org/jira/browse/FLINK-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886965#comment-16886965 ] Dian Fu commented on FLINK-13308: - Hi [~Zentol] Thanks a lot for opening this ticket. Either way makes sense to me, and personally I'm in favor of dropping the classifier. What do you think? > flink-python releases 2 jars > > > Key: FLINK-13308 > URL: https://issues.apache.org/jira/browse/FLINK-13308 > Project: Flink > Issue Type: Bug > Components: API / Python, Build System >Affects Versions: 1.9.0 >Reporter: Chesnay Schepler >Priority: Blocker > Fix For: 1.9.0 > > > {{flink-python}} uses a classifier to differentiate itself from the old > python API. Turns out this doesn't work since it still tries to release a > normal unshaded flink-python jar. > We should drop the classifier, and either stick to flink-python or rename it > as proposed in FLINK-12776. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (FLINK-13308) flink-python releases 2 jars
[ https://issues.apache.org/jira/browse/FLINK-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886978#comment-16886978 ] Dian Fu commented on FLINK-13308: - Sounds good. I'll take this issue and provide a PR ASAP. > flink-python releases 2 jars > > > Key: FLINK-13308 > URL: https://issues.apache.org/jira/browse/FLINK-13308 > Project: Flink > Issue Type: Bug > Components: API / Python, Build System >Affects Versions: 1.9.0 >Reporter: Chesnay Schepler >Priority: Blocker > Fix For: 1.9.0 > > > {{flink-python}} uses a classifier to differentiate itself from the old > python API. Turns out this doesn't work since it still tries to release a > normal unshaded flink-python jar. > We should drop the classifier, and either stick to flink-python or rename it > as proposed in FLINK-12776. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (FLINK-14019) Python environment and dependency management
[ https://issues.apache.org/jira/browse/FLINK-14019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933348#comment-16933348 ] Dian Fu commented on FLINK-14019: - Looking forward to the design doc. This is a very important feature for Python UDFs. > Python environment and dependency management > > > Key: FLINK-14019 > URL: https://issues.apache.org/jira/browse/FLINK-14019 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Assignee: Wei Zhong >Priority: Major > Fix For: 1.10.0 > > > A Python user-defined function may depend on third-party dependencies. We > should provide a proper way to handle them: > # Provide a way to let users specify the dependencies > # Provide a way to let users specify the Python environment used -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14269) The close method does not work for UserDefinedFunction
Dian Fu created FLINK-14269: --- Summary: The close method does not work for UserDefinedFunction Key: FLINK-14269 URL: https://issues.apache.org/jira/browse/FLINK-14269 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 As discussed in, the close method of UserDefinedFunction does not work for now as it's not called due to issues of Beam's portability framework, i.e., the {{SdkHarness}} in sdk_worker.py need to call the stop method of {{SdkWorker}} so that the close method of UserDefinedFunction will be called. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-14269) The close method does not work for UserDefinedFunction
[ https://issues.apache.org/jira/browse/FLINK-14269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-14269: Description: As discussed in [https://github.com/apache/flink/pull/9766], the close method of UserDefinedFunction does not work for now as it's not called due to issues of Beam's portability framework, i.e., the {{SdkHarness}} in sdk_worker.py need to call the stop method of {{SdkWorker}} so that the close method of UserDefinedFunction will be called. (was: As discussed in, the close method of UserDefinedFunction does not work for now as it's not called due to issues of Beam's portability framework, i.e., the {{SdkHarness}} in sdk_worker.py need to call the stop method of {{SdkWorker}} so that the close method of UserDefinedFunction will be called. ) > The close method does not work for UserDefinedFunction > -- > > Key: FLINK-14269 > URL: https://issues.apache.org/jira/browse/FLINK-14269 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Priority: Major > Fix For: 1.10.0 > > > As discussed in [https://github.com/apache/flink/pull/9766], the close method > of UserDefinedFunction does not work for now as it's not called due to issues > of Beam's portability framework, i.e., the {{SdkHarness}} in sdk_worker.py > need to call the stop method of {{SdkWorker}} so that the close method of > UserDefinedFunction will be called. -- This message was sent by Atlassian Jira (v8.3.4#803005)
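The lifecycle contract at stake can be sketched in plain Python. Everything below (`ScalarFunctionSketch`, `run_worker`) is a hypothetical illustration, not Flink or Beam API; it only shows why a worker loop that never invokes `close()` leaks whatever `open()` acquired:

```python
# Hypothetical sketch of the UDF lifecycle contract discussed above:
# open() is called once before processing, close() once at shutdown.
# The reported bug is that the worker loop never invoked close().
class ScalarFunctionSketch:
    """Stand-in for a UDF base class with lifecycle hooks."""
    def open(self):
        pass
    def eval(self, *args):
        raise NotImplementedError
    def close(self):
        pass

class ConnectionUdf(ScalarFunctionSketch):
    def __init__(self):
        self.events = []
    def open(self):
        self.events.append("open")     # e.g. acquire a DB connection
    def eval(self, x):
        return x * 2
    def close(self):
        self.events.append("close")    # e.g. release the connection

def run_worker(udf, rows, call_close=True):
    """Mimics the worker loop; with the bug, call_close would be False."""
    udf.open()
    out = [udf.eval(r) for r in rows]
    if call_close:                     # the stop()/close() call that was missing
        udf.close()
    return out

udf = ConnectionUdf()
print(run_worker(udf, [1, 2, 3]))      # [2, 4, 6]
print(udf.events)                      # ['open', 'close']
```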
[jira] [Created] (FLINK-14272) Support Blink planner for Python UDF
Dian Fu created FLINK-14272: --- Summary: Support Blink planner for Python UDF Key: FLINK-14272 URL: https://issues.apache.org/jira/browse/FLINK-14272 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 Currently, Python UDFs only work in the legacy planner; we should also support them in the Blink planner. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14269) The close method does not work for Python UserDefinedFunction
[ https://issues.apache.org/jira/browse/FLINK-14269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940303#comment-16940303 ] Dian Fu commented on FLINK-14269: - [~jark] Thanks for correcting the title. The new title makes sense to me. > The close method does not work for Python UserDefinedFunction > - > > Key: FLINK-14269 > URL: https://issues.apache.org/jira/browse/FLINK-14269 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Priority: Major > Fix For: 1.10.0 > > > As discussed in [https://github.com/apache/flink/pull/9766], the close method > of UserDefinedFunction does not work for now as it's not called due to issues > of Beam's portability framework, i.e., the {{SdkHarness}} in sdk_worker.py > need to call the stop method of {{SdkWorker}} so that the close method of > UserDefinedFunction will be called. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-14208) Optimize Python UDFs with parameters of constant values
[ https://issues.apache.org/jira/browse/FLINK-14208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-14208: Description: We need to support Python UDFs with parameters of constant values. Note that the constant parameters do not need to be transferred between the Java operator and the Python worker. (was: We need optimize Python UDFs with parameters of constant values. For example, the constant parameter isn't needed to be transferred between the Java operator and the Python worker.) > Optimize Python UDFs with parameters of constant values > --- > > Key: FLINK-14208 > URL: https://issues.apache.org/jira/browse/FLINK-14208 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Priority: Major > Fix For: 1.10.0 > > > We need to support Python UDFs with parameters of constant values. Note that > the constant parameters do not need to be transferred between > the Java operator and the Python worker. -- This message was sent by Atlassian Jira (v8.3.4#803005)
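The intent of the optimization above can be sketched with `functools.partial`: bind the constant argument once instead of shipping it with every row. The payload tuples below are a hypothetical stand-in for the serialized per-row messages between the Java operator and the Python worker; none of this is actual Flink API:

```python
from functools import partial

# Hypothetical sketch: a UDF called as format_temp('C', value), where 'C'
# is a constant parameter known at plan time.
def format_temp(unit, value):
    return f"{value} {unit}"

rows = [20, 21, 3]

# Without the optimization: the constant 'C' is serialized with every row.
naive_payloads = [("C", v) for v in rows]

# With the optimization: bind the constant once on the worker side, so the
# per-row payload carries only the non-constant column.
bound = partial(format_temp, "C")
optimized_payloads = [(v,) for v in rows]   # constant not shipped per row
results = [bound(*p) for p in optimized_payloads]
print(results)   # ['20 C', '21 C', '3 C']
```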
[jira] [Created] (FLINK-14240) The configuration added via TableConfig#configuration should be added to GlobalJobParameters for the legacy planner
Dian Fu created FLINK-14240: --- Summary: The configuration added via TableConfig#configuration should be added to GlobalJobParameters for the legacy planner Key: FLINK-14240 URL: https://issues.apache.org/jira/browse/FLINK-14240 Project: Flink Issue Type: Bug Components: Table SQL / Planner Reporter: Dian Fu Fix For: 1.10.0 TableConfig#configuration was added in FLINK-12348 and allows users to add configurations to the underlying configuration. The added configuration is added to GlobalJobParameters for the Blink planner. This should also be supported for the legacy planner. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14178) maven-shade-plugin 3.2.1 doesn't work on ARM for Flink
[ https://issues.apache.org/jira/browse/FLINK-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936693#comment-16936693 ] Dian Fu commented on FLINK-14178: - Hi [~wangxiyuan], Thanks a lot for your great work. The maven-shade-plugin was bumped because the Python UDF support introduces a dependency on Beam, and maven-shade-plugin versions below 3.1.0 do not work well with ASM 6.0 (MSHADE-258), which Beam depends on. We bumped it directly from 3.0.0 to 3.2.1 because we noticed that the java11 profile had already upgraded maven-shade-plugin to 3.2.1, as Java 11 is only supported since [3.2.1|[https://github.com/apache/flink/pull/9294]]. So downgrading maven-shade-plugin to 3.1.0 will not break the Python UDF functionality, and I'm +1 if a downgrade is needed. Just want to point out that since maven-shade-plugin only supports Java 11 from 3.2.1 onwards, and 3.2.1 doesn't work on ARM, it will not be possible to build Flink on ARM with Java 11. Could you verify this? > maven-shade-plugin 3.2.1 doesn't work on ARM for Flink > -- > > Key: FLINK-14178 > URL: https://issues.apache.org/jira/browse/FLINK-14178 > Project: Flink > Issue Type: Sub-task > Components: Build System >Affects Versions: 2.0.0 >Reporter: wangxiyuan >Priority: Minor > Fix For: 2.0.0 > > > recently, maven-shade-plugin is bumped from 3.0.0 to 3.2.1 by the > [commit|https://github.com/apache/flink/commit/e7216eebc846a69272c21375af0f4db8009c2e3e]. > While with my test locally on ARM, The Flink build process will be jammed. > After debugging, I found there is an infinite loop. > Downgrade maven-shade-plugin to 3.1.0 can solve this problem. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-14178) maven-shade-plugin 3.2.1 doesn't work on ARM for Flink
[ https://issues.apache.org/jira/browse/FLINK-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936693#comment-16936693 ] Dian Fu edited comment on FLINK-14178 at 9/24/19 11:36 AM: --- Hi [~wangxiyuan], Thanks a lot for your great work. The maven-shade-plugin is bumped because it introduces the dependency of Beam for Python UDF support and maven-shade-plugin of version below 3.1.0 does not work well with ASM 6.0 (MSHADE-258) which is dependent by Beam. It bump it directly from 3.0.0 to 3.2.1 because we noticed that profile java11 has upgraded maven-shade-plugin to 3.2.1 as it only supports java11 since [3.2.1|https://github.com/apache/flink/pull/9294]. So downgrade maven-shade-plugin to 3.1.0 will not break the functionality of Python UDF and I'm +1 if downgrade is needed. Just want to point out that as maven-shade-plugin only supports java11 since 3.2.1 and it doesn't work on ARM for 3.2.1. It means that it's not possible to build flink on ARM for java11. Could you confirm this? was (Author: dian.fu): Hi [~wangxiyuan], Thanks a lot for your great work. The maven-shade-plugin is bumped because it introduces the dependency of Beam for Python UDF support and maven-shade-plugin of version below 3.1.0 does not work well with ASM 6.0 (MSHADE-258) which is dependent by Beam. It bump it directly from 3.0.0 to 3.2.1 because we noticed that profile java11 has upgraded maven-shade-plugin to 3.2.1 as it only supports java11 since [3.2.1|https://github.com/apache/flink/pull/9294]. So downgrade maven-shade-plugin to 3.1.0 will not break the functionality of Python UDF and I'm +1 if downgrade is needed. Just want to point out that as maven-shade-plugin only supports java11 since 3.2.1 and it doesn't work on ARM for 3.2.1. It means that it's not possible to build flink on ARM for java11. Could you verify this? 
> maven-shade-plugin 3.2.1 doesn't work on ARM for Flink > -- > > Key: FLINK-14178 > URL: https://issues.apache.org/jira/browse/FLINK-14178 > Project: Flink > Issue Type: Sub-task > Components: Build System >Affects Versions: 2.0.0 >Reporter: wangxiyuan >Priority: Minor > Fix For: 2.0.0 > > > recently, maven-shade-plugin is bumped from 3.0.0 to 3.2.1 by the > [commit|https://github.com/apache/flink/commit/e7216eebc846a69272c21375af0f4db8009c2e3e]. > While with my test locally on ARM, The Flink build process will be jammed. > After debugging, I found there is an infinite loop. > Downgrade maven-shade-plugin to 3.1.0 can solve this problem. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-14178) maven-shade-plugin 3.2.1 doesn't work on ARM for Flink
[ https://issues.apache.org/jira/browse/FLINK-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936693#comment-16936693 ] Dian Fu edited comment on FLINK-14178 at 9/24/19 11:35 AM: --- Hi [~wangxiyuan], Thanks a lot for your great work. The maven-shade-plugin is bumped because it introduces the dependency of Beam for Python UDF support and maven-shade-plugin of version below 3.1.0 does not work well with ASM 6.0 (MSHADE-258) which is dependent by Beam. It bump it directly from 3.0.0 to 3.2.1 because we noticed that profile java11 has upgraded maven-shade-plugin to 3.2.1 as it only supports java11 since [3.2.1|https://github.com/apache/flink/pull/9294]. So downgrade maven-shade-plugin to 3.1.0 will not break the functionality of Python UDF and I'm +1 if downgrade is needed. Just want to point out that as maven-shade-plugin only supports java11 since 3.2.1 and it doesn't work on ARM for 3.2.1. It means that it's not possible to build flink on ARM for java11. Could you verify this? was (Author: dian.fu): Hi [~wangxiyuan], Thanks a lot for your great work. The maven-shade-plugin is bumped because it introduces the dependency of Beam for Python UDF support and maven-shade-plugin of version below 3.1.0 does not work well with ASM 6.0 (MSHADE-258) which is dependent by Beam. It bump it directly from 3.0.0 to 3.2.1 because we noticed that profile java11 has upgraded maven-shade-plugin to 3.2.1 as it only supports java11 since [3.2.1|[https://github.com/apache/flink/pull/9294]]. So downgrade maven-shade-plugin to 3.1.0 will not break the functionality of Python UDF and I'm +1 if downgrade is needed. Just want to point out that as maven-shade-plugin only supports java11 since 3.2.1 and it doesn't work on ARM for 3.2.1. It means that it's not possible to build flink on ARM for java11. Could you verify this? 
> maven-shade-plugin 3.2.1 doesn't work on ARM for Flink > -- > > Key: FLINK-14178 > URL: https://issues.apache.org/jira/browse/FLINK-14178 > Project: Flink > Issue Type: Sub-task > Components: Build System >Affects Versions: 2.0.0 >Reporter: wangxiyuan >Priority: Minor > Fix For: 2.0.0 > > > recently, maven-shade-plugin is bumped from 3.0.0 to 3.2.1 by the > [commit|https://github.com/apache/flink/commit/e7216eebc846a69272c21375af0f4db8009c2e3e]. > While with my test locally on ARM, The Flink build process will be jammed. > After debugging, I found there is an infinite loop. > Downgrade maven-shade-plugin to 3.1.0 can solve this problem. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14226) Subset of nightly tests fail due to No output has been received in the last 10m0s
[ https://issues.apache.org/jira/browse/FLINK-14226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940631#comment-16940631 ] Dian Fu commented on FLINK-14226: - Hi [~sunjincheng121], I'm +1 to downgrade maven-shade-plugin to 3.1.0. I have created a [PR|https://github.com/apache/flink/pull/9817]; could you help take a look? [~sunjincheng121] [~gjy] [~hequn8128] > Subset of nightly tests fail due to No output has been received in the last > 10m0s > - > > Key: FLINK-14226 > URL: https://issues.apache.org/jira/browse/FLINK-14226 > Project: Flink > Issue Type: Bug > Components: Build System >Reporter: Gary Yao >Priority: Blocker > > https://travis-ci.org/apache/flink/builds/589469198 > https://api.travis-ci.org/v3/job/589469225/log.txt > {noformat} > 19:51:07.028 [INFO] --- maven-shade-plugin:3.2.1:shade (shade-flink) @ > flink-elasticsearch6-test --- > 19:51:07.038 [INFO] Excluding > org.apache.flink:flink-connector-elasticsearch6_2.11:jar:1.10-SNAPSHOT from > the shaded jar. > 19:51:07.045 [INFO] Excluding > org.apache.flink:flink-connector-elasticsearch-base_2.11:jar:1.10-SNAPSHOT > from the shaded jar. > 19:51:07.046 [INFO] Excluding > org.elasticsearch.client:elasticsearch-rest-high-level-client:jar:6.3.1 from > the shaded jar. > 19:51:07.046 [INFO] Excluding org.elasticsearch:elasticsearch:jar:6.3.1 from > the shaded jar. > 19:51:07.046 [INFO] Excluding org.elasticsearch:elasticsearch-core:jar:6.3.1 > from the shaded jar. > 19:51:07.046 [INFO] Excluding > org.elasticsearch:elasticsearch-secure-sm:jar:6.3.1 from the shaded jar. > 19:51:07.046 [INFO] Excluding > org.elasticsearch:elasticsearch-x-content:jar:6.3.1 from the shaded jar. > 19:51:07.047 [INFO] Excluding org.yaml:snakeyaml:jar:1.17 from the shaded jar. > 19:51:07.047 [INFO] Excluding > com.fasterxml.jackson.core:jackson-core:jar:2.8.10 from the shaded jar. 
> 19:51:07.047 [INFO] Excluding > com.fasterxml.jackson.dataformat:jackson-dataformat-smile:jar:2.8.10 from the > shaded jar. > 19:51:07.047 [INFO] Excluding > com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:jar:2.8.10 from the > shaded jar. > 19:51:07.047 [INFO] Excluding > com.fasterxml.jackson.dataformat:jackson-dataformat-cbor:jar:2.8.10 from the > shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-core:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding > org.apache.lucene:lucene-analyzers-common:jar:7.3.1 from the shaded jar. > 19:51:07.047 [INFO] Excluding > org.apache.lucene:lucene-backward-codecs:jar:7.3.1 from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-grouping:jar:7.3.1 > from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-highlighter:jar:7.3.1 > from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-join:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-memory:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-misc:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-queries:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-queryparser:jar:7.3.1 > from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-sandbox:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-spatial:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding > org.apache.lucene:lucene-spatial-extras:jar:7.3.1 from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-spatial3d:jar:7.3.1 > from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-suggest:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.elasticsearch:elasticsearch-cli:jar:6.3.1 > from the shaded jar. 
> 19:51:07.047 [INFO] Excluding net.sf.jopt-simple:jopt-simple:jar:5.0.2 from > the shaded jar. > 19:51:07.047 [INFO] Excluding com.carrotsearch:hppc:jar:0.7.1 from the shaded > jar. > 19:51:07.047 [INFO] Excluding joda-time:joda-time:jar:2.5 from the shaded jar. > 19:51:07.047 [INFO] Excluding com.tdunning:t-digest:jar:3.2 from the shaded > jar. > 19:51:07.047 [INFO] Excluding org.hdrhistogram:HdrHistogram:jar:2.1.9 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.elasticsearch:jna:jar:4.5.1 from the shaded > jar. > 19:51:07.047 [INFO] Excluding > org.elasticsearch.client:elasticsearch-rest-client:jar:6.3.1 from the shaded > jar. > 19:51:07.047 [INFO] Excluding org.apache.httpcomponents:httpclient:jar:4.5.3 > from the shaded jar. > 19:51:07.048 [INFO] Excluding org.apache.httpcomponents:httpcore:jar:4.4.6 > from the shaded jar. > 19:51:07.048 [INFO] Excluding >
[jira] [Commented] (FLINK-14288) Add Py4j NOTICE for source release
[ https://issues.apache.org/jira/browse/FLINK-14288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940591#comment-16940591 ] Dian Fu commented on FLINK-14288: - Hi [~sunjincheng121], good catch! I'd like to take this issue. Would you please assign it to me? > Add Py4j NOTICE for source release > --- > > Key: FLINK-14288 > URL: https://issues.apache.org/jira/browse/FLINK-14288 > Project: Flink > Issue Type: Bug > Components: API / Python >Affects Versions: 1.9.0 >Reporter: sunjincheng >Priority: Blocker > Fix For: 1.9.1 > > > I just found that we should add Py4j NOTICE for source release. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14178) maven-shade-plugin 3.2.1 doesn't work on ARM for Flink
[ https://issues.apache.org/jira/browse/FLINK-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940626#comment-16940626 ] Dian Fu commented on FLINK-14178: - [~sunjincheng121] Downgrading the maven-shade-plugin to 3.1.0 makes sense to me. I'd like to take this ticket and fix it, could you assign this issue to me? > maven-shade-plugin 3.2.1 doesn't work on ARM for Flink > -- > > Key: FLINK-14178 > URL: https://issues.apache.org/jira/browse/FLINK-14178 > Project: Flink > Issue Type: Sub-task > Components: Build System >Affects Versions: 2.0.0 >Reporter: wangxiyuan >Priority: Minor > Fix For: 2.0.0 > > Attachments: debug.log > > > recently, maven-shade-plugin is bumped from 3.0.0 to 3.2.1 by the > [commit|https://github.com/apache/flink/commit/e7216eebc846a69272c21375af0f4db8009c2e3e]. > While with my test locally on ARM, The Flink build process will be jammed. > After debugging, I found there is an infinite loop. > Downgrade maven-shade-plugin to 3.1.0 can solve this problem. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14222) Optimize for Python UDFs with all parameters are constant values
Dian Fu created FLINK-14222: --- Summary: Optimize for Python UDFs with all parameters are constant values Key: FLINK-14222 URL: https://issues.apache.org/jira/browse/FLINK-14222 Project: Flink Issue Type: Sub-task Components: API / Python Reporter: Dian Fu Fix For: 1.10.0 As discussed in [https://github.com/apache/flink/pull/9766], a Python UDF call could be optimized to a constant value if the UDF is deterministic and all of its parameters are constant values. -- This message was sent by Atlassian Jira (v8.3.4#803005)
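A minimal sketch of such constant folding, in plain Python. The function `try_constant_fold` and the "$col" column-reference marker are hypothetical illustrations, not part of Flink's planner: if every argument is a literal and the UDF is marked deterministic, the call can be evaluated once at plan time and replaced by its result.

```python
# Hypothetical sketch of planner-side constant folding for UDF calls.
# Arguments starting with "$" stand for column references; everything
# else is treated as a literal known at plan time.
def try_constant_fold(udf, args, deterministic):
    all_literal = all(
        not (isinstance(a, str) and a.startswith("$")) for a in args
    )
    if deterministic and all_literal:
        return ("literal", udf(*args))   # folded once at plan time
    return ("call", udf, args)           # kept as a per-row call

add = lambda a, b: a + b
print(try_constant_fold(add, (2, 3), True))        # ('literal', 5)
print(try_constant_fold(add, ("$c0", 3), True)[0]) # call
```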
[jira] [Updated] (FLINK-14208) Optimize Python UDFs with parameters of constant values
[ https://issues.apache.org/jira/browse/FLINK-14208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-14208: Summary: Optimize Python UDFs with parameters of constant values (was: Support Python UDF with parameters of constant values) > Optimize Python UDFs with parameters of constant values > --- > > Key: FLINK-14208 > URL: https://issues.apache.org/jira/browse/FLINK-14208 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Priority: Major > Fix For: 1.10.0 > > > We need provide native support for Python UDF with parameters of constant > values. For example, the constant parameter isn't needed to be transferred > between the Java operator and the Python worker. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-14208) Optimize Python UDFs with parameters of constant values
[ https://issues.apache.org/jira/browse/FLINK-14208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-14208: Description: We need optimize Python UDFs with parameters of constant values. For example, the constant parameter isn't needed to be transferred between the Java operator and the Python worker. (was: We need provide native support for Python UDF with parameters of constant values. For example, the constant parameter isn't needed to be transferred between the Java operator and the Python worker.) > Optimize Python UDFs with parameters of constant values > --- > > Key: FLINK-14208 > URL: https://issues.apache.org/jira/browse/FLINK-14208 > Project: Flink > Issue Type: Sub-task > Components: API / Python >Reporter: Dian Fu >Priority: Major > Fix For: 1.10.0 > > > We need optimize Python UDFs with parameters of constant values. For example, > the constant parameter isn't needed to be transferred between the Java > operator and the Python worker. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-14226) Subset of nightly tests fail due to No output has been received in the last 10m0s
[ https://issues.apache.org/jira/browse/FLINK-14226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940796#comment-16940796 ] Dian Fu commented on FLINK-14226: - [~trohrmann] Thanks for the reminder. The cron test is still running; I will let you know once it's finished. > Subset of nightly tests fail due to No output has been received in the last > 10m0s > - > > Key: FLINK-14226 > URL: https://issues.apache.org/jira/browse/FLINK-14226 > Project: Flink > Issue Type: Bug > Components: Build System >Reporter: Gary Yao >Priority: Blocker > > https://travis-ci.org/apache/flink/builds/589469198 > https://api.travis-ci.org/v3/job/589469225/log.txt > {noformat} > 19:51:07.028 [INFO] --- maven-shade-plugin:3.2.1:shade (shade-flink) @ > flink-elasticsearch6-test --- > 19:51:07.038 [INFO] Excluding > org.apache.flink:flink-connector-elasticsearch6_2.11:jar:1.10-SNAPSHOT from > the shaded jar. > 19:51:07.045 [INFO] Excluding > org.apache.flink:flink-connector-elasticsearch-base_2.11:jar:1.10-SNAPSHOT > from the shaded jar. > 19:51:07.046 [INFO] Excluding > org.elasticsearch.client:elasticsearch-rest-high-level-client:jar:6.3.1 from > the shaded jar. > 19:51:07.046 [INFO] Excluding org.elasticsearch:elasticsearch:jar:6.3.1 from > the shaded jar. > 19:51:07.046 [INFO] Excluding org.elasticsearch:elasticsearch-core:jar:6.3.1 > from the shaded jar. > 19:51:07.046 [INFO] Excluding > org.elasticsearch:elasticsearch-secure-sm:jar:6.3.1 from the shaded jar. > 19:51:07.046 [INFO] Excluding > org.elasticsearch:elasticsearch-x-content:jar:6.3.1 from the shaded jar. > 19:51:07.047 [INFO] Excluding org.yaml:snakeyaml:jar:1.17 from the shaded jar. > 19:51:07.047 [INFO] Excluding > com.fasterxml.jackson.core:jackson-core:jar:2.8.10 from the shaded jar. > 19:51:07.047 [INFO] Excluding > com.fasterxml.jackson.dataformat:jackson-dataformat-smile:jar:2.8.10 from the > shaded jar. 
> 19:51:07.047 [INFO] Excluding > com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:jar:2.8.10 from the > shaded jar. > 19:51:07.047 [INFO] Excluding > com.fasterxml.jackson.dataformat:jackson-dataformat-cbor:jar:2.8.10 from the > shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-core:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding > org.apache.lucene:lucene-analyzers-common:jar:7.3.1 from the shaded jar. > 19:51:07.047 [INFO] Excluding > org.apache.lucene:lucene-backward-codecs:jar:7.3.1 from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-grouping:jar:7.3.1 > from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-highlighter:jar:7.3.1 > from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-join:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-memory:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-misc:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-queries:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-queryparser:jar:7.3.1 > from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-sandbox:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-spatial:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding > org.apache.lucene:lucene-spatial-extras:jar:7.3.1 from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-spatial3d:jar:7.3.1 > from the shaded jar. > 19:51:07.047 [INFO] Excluding org.apache.lucene:lucene-suggest:jar:7.3.1 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.elasticsearch:elasticsearch-cli:jar:6.3.1 > from the shaded jar. > 19:51:07.047 [INFO] Excluding net.sf.jopt-simple:jopt-simple:jar:5.0.2 from > the shaded jar. 
> 19:51:07.047 [INFO] Excluding com.carrotsearch:hppc:jar:0.7.1 from the shaded > jar. > 19:51:07.047 [INFO] Excluding joda-time:joda-time:jar:2.5 from the shaded jar. > 19:51:07.047 [INFO] Excluding com.tdunning:t-digest:jar:3.2 from the shaded > jar. > 19:51:07.047 [INFO] Excluding org.hdrhistogram:HdrHistogram:jar:2.1.9 from > the shaded jar. > 19:51:07.047 [INFO] Excluding org.elasticsearch:jna:jar:4.5.1 from the shaded > jar. > 19:51:07.047 [INFO] Excluding > org.elasticsearch.client:elasticsearch-rest-client:jar:6.3.1 from the shaded > jar. > 19:51:07.047 [INFO] Excluding org.apache.httpcomponents:httpclient:jar:4.5.3 > from the shaded jar. > 19:51:07.048 [INFO] Excluding org.apache.httpcomponents:httpcore:jar:4.4.6 > from the shaded jar. > 19:51:07.048 [INFO] Excluding > org.apache.httpcomponents:httpasyncclient:jar:4.1.2 from the shaded jar. > 19:51:07.048 [INFO] Excluding >