Re: [Discussion] FLIP-79 Flink Function DDL Support
Hi Bowen, I can't agree more that we should first reach agreement on the DDL syntax and focus on the MVP in the current phase. 1) What's the syntax to distinguish function language? Currently, there are two options: - USING 'python .' - [LANGUAGE JVM|PYTHON] USING JAR '...' As we need to support multiple resources like HQL does, we shouldn't repeat the language symbol as a suffix of each resource. I would prefer option two, but am definitely open to more comments. 2) How to persist function language in the backend catalog? As a k-v pair in the properties map, or a dedicated field? Even though the language type is also a property, I think a separate field in CatalogFunction is a cleaner solution. 3) Do we really need to allow users to set a properties map for a udf? What needs to be stored there? What are they used for? I am considering a type of use case that uses UDFs for real-time inference. The model is bundled into the udf as a resource, but multiple parameters are customizable. In this way, users can use properties to define those parameters. I only have answers to these questions; for the questions about the catalog implementation, I hope we can collect more feedback from the community. Best Regards Peter Huang On Tue, Oct 29, 2019 at 11:31 AM Bowen Li wrote: > Hi all, > > Besides all the good questions raised above, we seem all agree to have a > MVP for Flink 1.10, "to support users to create and persist a java > class-based udf that's already in classpath (no extra resource loading), > and use it later in queries". > > IIUIC, to achieve that in 1.10, the following are currently the core > issues/blockers we should figure out, and solve them as our **highest > priority**: > > - what's the syntax to distinguish function language (java, scala, python, > etc)? we only need to implement the java one in 1.10 but have to settle > down the long term solution > - how to persist function language in backend catalog? as a k-v pair in > properties map, or a dedicate field? 
> - do we really need to allow users set a properties map for udf? what needs > to be stored there? what are they used for? > - should a catalog impl be able to decide whether it can take a properties > map (if we decide to have one), and which language of a udf it can persist? >- E.g. Hive metastore, which backs Flink's HiveCatalog, cannot take a > properties map and is only able to persist java udf [1], unless we do > something hacky to it > > I feel these questions are essential to Flink functions in the long run, > but most importantly, are also the minimum scope for Flink 1.10. Aspects > like resource loading security or compatibility with Hive syntax are > important too, however if we focus on them now, we may not be able to get > the MVP out in time. > > [1] > - > > https://hive.apache.org/javadocs/r3.1.2/api/org/apache/hadoop/hive/metastore/api/Function.html > - > > https://hive.apache.org/javadocs/r3.1.2/api/org/apache/hadoop/hive/metastore/api/FunctionType.html > > > > On Sun, Oct 27, 2019 at 8:22 PM Peter Huang > wrote: > > > Hi Timo, > > > > Thanks for the feedback. I replied and adjust the design accordingly. For > > the concern of class loading. > > I think we need to distinguish the function class loading for Temporary > and > > Permanent function. > > > > 1) For Permanent function, we can add it to the job graph so that we > don't > > need to load it multiple times for the different sessions. > > 2) For Temporary function, we can register function with a session key, > and > > use different class loaders in RuntimeContext implementation. > > > > I added more description in the doc. Please review it again. > > > > > > Best Regards > > Peter Huang > > > > > > > > > > On Thu, Oct 24, 2019 at 2:14 AM Timo Walther wrote: > > > > > Hi Peter, > > > > > > thanks for your proposal. I left some comments in the FLIP document. 
I > > > agree with Terry that we can have a MVP in Flink 1.10 but should > already > > > discuss the bigger picture as a DDL string cannot be changed easily > once > > > released. > > > > > > In particular we should discuss how resources for function are loaded. > > > If they are simply added to the JobGraph they are available to all > > > functions and could potentially interfere with each other, right? > > > > > > Thanks, > > > Timo > > > > > > > > > > > > On 24.10.19 05:32, Terry Wang wrote: > > > > Hi Peter, > > > > > > > > Sorry late to reply. Thanks for your efforts on this and I just > looked > > > through your design. > > > > I left some comments in the doc about alter function section and > > > function catalog interface. > > > > IMO, the overall design is ok and we can discuss further more about > > some > > > details. > > > > I also think it’s necessary to have this awesome feature limit to > basic > > > function (of course better to have all :) ) in 1.10 release. > > > > > > > > Best, > > > > Terry Wang > > > > > > > > > > > > > > > >> 2019年10月16日 14:19,Peter Huang 写道: >
Re: [Discussion] FLIP-79 Flink Function DDL Support
Hi Jingsong, Thanks for the input. The Flink function DDL definitely needs to align with HQL; I updated the doc accordingly. CREATE FUNCTION [db_name.]function_name AS class_name [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ]; For your other questions below: 1) How to load resources for a function (how to deal with jar/file/archive). Let's consider the jar case first; for file and archive, I will do more study on the Hive side. The basic idea for loading jars without dependency conflicts is to use separate class loaders for different sessions. I updated the doc with the interface changes required to achieve the goal. 2) How to pass properties to a function. It can be a setProperties method in the UDF interface, or a constructor that takes a Map of parameters. As Bowen commented on the doc, I think we probably just need to let users provide such a constructor if they want to use properties in DDL. 3) How does a python udf work? It is not in the scope of this FLIP; I think FLIP-78 will provide the runtime support, and we just need to bridge the DDL with its runtime interface. But yes, this part needs to be added, probably in the next phase after the MVP is done. Best Regards Peter Huang On Thu, Oct 24, 2019 at 11:07 PM Jingsong Li wrote: > Hi Peter, > > Thanks for your proposal. The first thing I care about most is whether it > can cover the needs of hive. > Hive create function: > > CREATE FUNCTION [db_name.]function_name AS class_name > [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ]; > > Hive support a list of resources, and support jar/file/archive, Maybe we > need users to tell us exactly what kind of resources are. So we can see > whether to add it to the ClassLoader or other processing? > > +1 for the internal implementation as timo said, like: > - how to load resources for function. (How to deal with jar/file/archive) > - how to pass properties to function. > - How does python udf work? 
Hive use Transform command to run shell and > python. It would be better if we could make clear how to do. > > Hope to get your reply~ > > Best, > Jingsong Lee > > On Thu, Oct 24, 2019 at 5:14 PM Timo Walther wrote: > > > Hi Peter, > > > > thanks for your proposal. I left some comments in the FLIP document. I > > agree with Terry that we can have a MVP in Flink 1.10 but should already > > discuss the bigger picture as a DDL string cannot be changed easily once > > released. > > > > In particular we should discuss how resources for function are loaded. > > If they are simply added to the JobGraph they are available to all > > functions and could potentially interfere with each other, right? > > > > Thanks, > > Timo > > > > > > > > On 24.10.19 05:32, Terry Wang wrote: > > > Hi Peter, > > > > > > Sorry late to reply. Thanks for your efforts on this and I just looked > > through your design. > > > I left some comments in the doc about alter function section and > > function catalog interface. > > > IMO, the overall design is ok and we can discuss further more about > some > > details. > > > I also think it’s necessary to have this awesome feature limit to basic > > function (of course better to have all :) ) in 1.10 release. > > > > > > Best, > > > Terry Wang > > > > > > > > > > > >> 2019年10月16日 14:19,Peter Huang 写道: > > >> > > >> Hi Xuefu, > > >> > > >> Thank you for the feedback. I think you are pointing out a similar > > concern > > >> with Bowen. Let me describe > > >> how the catalog function and function factory will be changed in the > > >> implementation section. > > >> Then, we can have more discussion in detail. > > >> > > >> > > >> Best Regards > > >> Peter Huang > > >> > > >> On Tue, Oct 15, 2019 at 4:18 PM Xuefu Z wrote: > > >> > > >>> Thanks to Peter for the proposal! > > >>> > > >>> I left some comments in the google doc. Besides what Bowen pointed > > out, I'm > > >>> unclear about how things work end to end from the document. 
For > > instance, > > >>> SQL DDL-like function definition is mentioned. I guess just having a > > DDL > > >>> for it doesn't explain how it's supported functionally. I think it's > > better > > >>> to have some clarification on what is expected work and what's for > the > > >>> future. > > >>> > > >>> Thanks, > > >>> Xuefu > > >>> > > >>> > > >>> On Tue, Oct 15, 2019 at 11:05 AM Bowen Li > wrote: > > >>> > > Hi Zhenqiu, > > > > Thanks for taking on this effort! > > > > A couple questions: > > - Though this FLIP is about function DDL, can we also think about > how > > the > > created functions can be mapped to CatalogFunction and see if we > need > > to > > modify CatalogFunction interface? Syntax changes need to be backed > by > > the > > backend. > > - Can we define a clearer, smaller scope targeting for Flink 1.10 > > among > > >>> all > > the proposed changes? The current overall scope seems to be quite > > wide, > > >>> and > > it may be
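The constructor-based approach to passing DDL properties into a UDF (option 2 in Peter's reply above) can be sketched as follows. To stay self-contained the class does not extend Flink's actual ScalarFunction, and the class name and property key are illustrative assumptions:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the constructor-based properties approach. In real code this
// class would extend Flink's ScalarFunction; that dependency is omitted
// here to keep the example self-contained. Names are hypothetical.
public class InferenceUdfSketch {

    static final class InferenceFunction /* extends ScalarFunction */ {
        private final double threshold;

        // The properties map supplied by the DDL statement is handed to
        // this user-provided constructor, per the proposal in the thread.
        InferenceFunction(Map<String, String> properties) {
            this.threshold = Double.parseDouble(
                    properties.getOrDefault("threshold", "0.5"));
        }

        // eval() is the conventional UDF entry point.
        public boolean eval(double score) {
            return score >= threshold;
        }
    }

    public static void main(String[] args) {
        // Simulate properties coming from the CREATE FUNCTION statement.
        Map<String, String> props = new HashMap<>();
        props.put("threshold", "0.8");

        InferenceFunction f = new InferenceFunction(props);
        System.out.println(f.eval(0.9)); // above threshold
        System.out.println(f.eval(0.5)); // below threshold
    }
}
```

This matches the real-time inference use case mentioned earlier in the thread: the model ships as a resource, while tunable parameters (here, a decision threshold) arrive via the properties map.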
[jira] [Created] (FLINK-14565) Shutdown SystemResourcesCounter on (JM|TM)MetricGroup closed
Zili Chen created FLINK-14565: - Summary: Shutdown SystemResourcesCounter on (JM|TM)MetricGroup closed Key: FLINK-14565 URL: https://issues.apache.org/jira/browse/FLINK-14565 Project: Flink Issue Type: Bug Components: Runtime / Metrics Reporter: Zili Chen Assignee: Zili Chen Currently, we start a SystemResourcesCounter when initializing the (JM|TM)MetricGroup. This thread doesn't exit when the (JM|TM)MetricGroup is closed; in fact, there is no exit logic for it at all. This can cause a thread leak. For example, on our platform, which supports previewing sample SQL execution, a MiniCluster is started in the same process as the platform. When the preview job finishes, the MiniCluster is closed, and so is the (JM|TM)MetricGroup. However, the SystemResourcesCounter threads remain. I propose that when a SystemResourcesCounter is created, we track it in the (JM|TM)MetricGroup, and shut it down when the (JM|TM)MetricGroup is closed. This way, we avoid the thread leak. CC [~chesnay] [~trohrmann] -- This message was sent by Atlassian Jira (v8.3.4#803005)
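The proposed fix can be sketched as follows. This is a simplified stand-in with hypothetical class names, not Flink's actual code; it only illustrates the lifecycle idea of tracking the counter thread in the metric group and stopping it in close():

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of the proposed fix: the metric group tracks the sampling
// thread it started and shuts it down in close(). Class and method names
// are illustrative, not Flink's actual ones.
public class ResourceCounterSketch {

    static final class SystemResourcesCounter extends Thread {
        private final AtomicBoolean running = new AtomicBoolean(true);

        SystemResourcesCounter() {
            setName("SystemResourcesCounter");
            setDaemon(true);
        }

        @Override
        public void run() {
            while (running.get()) {
                try {
                    Thread.sleep(100); // stand-in for probing system stats
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }

        void shutdown() throws InterruptedException {
            running.set(false);
            interrupt(); // wake the thread if it is sleeping
            join();      // wait until it has actually exited
        }
    }

    /** Metric-group stand-in that owns the counter thread it started. */
    static final class MetricGroup implements AutoCloseable {
        final SystemResourcesCounter counter = new SystemResourcesCounter();

        MetricGroup() {
            counter.start();
        }

        @Override
        public void close() throws InterruptedException {
            counter.shutdown(); // the missing step: stop the thread on close
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MetricGroup group = new MetricGroup();
        group.close();
        System.out.println("counter alive after close: " + group.counter.isAlive());
    }
}
```

Without the shutdown() call in close(), the daemon thread would outlive the MiniCluster in the same JVM, which is exactly the leak described in the report.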
[VOTE] Move flink-orc to flink-formats from flink-connectors
Hi all: We already have the parent module for formats, and we have put the other formats (flink-avro, flink-json, flink-parquet, flink-csv, flink-sequence-file) into flink-formats. flink-orc is a format too, so we can move it to flink-formats. In theory, there should be no compatibility problem: only the parent module needs to be changed, and no other changes are needed. I would like to start the vote for it. The vote will be open for at least 72 hours. Discuss thread: [1] [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Move-flink-orc-to-flink-formats-from-flink-connectors-td34438.html -- Best, Jingsong Lee
Builds Notification Update
Hi, This is an update and discussion thread for the notification bot and the builds mailing list. We created the bui...@flink.apache.org mailing list to monitor the build status, and a notification bot is running behind it. You can subscribe to the mailing list by sending an email to builds-subscr...@flink.apache.org Recently, I updated the bot to notify only for the master and release branches, because I noticed notifications for some testing/temporary branches these days; those notifications are noise, so I filtered them out. Additionally, you can leave your thoughts about the mailing list and the bot in this thread. Best, Jark
Re: [DISCUSS] Move flink-orc to flink-formats from flink-connectors
Thanks jark for reminding me. I will start a vote thread. On Wed, Oct 30, 2019 at 11:40 AM Jark Wu wrote: > Hi Jingsong, > > Do we need a formal vote? Or do we need to wait for "Lazy > Majority" approval [1] which requires 3 binding votes? > > Best, > Jark > > [1]: > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026 > > On Wed, 30 Oct 2019 at 11:12, Jingsong Li wrote: > > > Hi bowen, > > I think the reason is that flink-orc was developed earlier than > > flink-formats. > > > > Thanks the responses of everyone, I will create JIRA to do it. > > > > On Wed, Oct 30, 2019 at 10:27 AM Danny Chan > wrote: > > > > > +1 to move to flink-format. > > > > > > Best, > > > Danny Chan > > > 在 2019年10月29日 +0800 AM11:10,dev@flink.apache.org,写道: > > > > > > > > +1 to move to flink-format. > > > > > > > > > -- > > Best, Jingsong Lee > > > -- Best, Jingsong Lee
Re: [DISCUSS] Move flink-orc to flink-formats from flink-connectors
Hi Jingsong, Do we need a formal vote? Or do we need to wait for "Lazy Majority" approval [1] which requires 3 binding votes? Best, Jark [1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026 On Wed, 30 Oct 2019 at 11:12, Jingsong Li wrote: > Hi bowen, > I think the reason is that flink-orc was developed earlier than > flink-formats. > > Thanks the responses of everyone, I will create JIRA to do it. > > On Wed, Oct 30, 2019 at 10:27 AM Danny Chan wrote: > > > +1 to move to flink-format. > > > > Best, > > Danny Chan > > 在 2019年10月29日 +0800 AM11:10,dev@flink.apache.org,写道: > > > > > > +1 to move to flink-format. > > > > > -- > Best, Jingsong Lee >
Re: [DISCUSS] Move flink-orc to flink-formats from flink-connectors
Hi Bowen, I think the reason is that flink-orc was developed earlier than flink-formats. Thanks for the responses, everyone; I will create a JIRA to do it. On Wed, Oct 30, 2019 at 10:27 AM Danny Chan wrote: > +1 to move to flink-format. > > Best, > Danny Chan > On Oct 29, 2019 at 11:10 AM +0800, dev@flink.apache.org wrote: > > > > +1 to move to flink-format. > -- Best, Jingsong Lee
Re: [DISCUSS] Move flink-orc to flink-formats from flink-connectors
+1 to move to flink-format. Best, Danny Chan On Oct 29, 2019 at 11:10 AM +0800, dev@flink.apache.org wrote: > > +1 to move to flink-format.
Re: [DISCUSS] Move flink-orc to flink-formats from flink-connectors
+1 to move to flink-format. Best, Jark On Wed, 30 Oct 2019 at 02:27, Bowen Li wrote: > Hi Jingsong, > > Thanks for bringing it up. I found flink-orc lives under flink-connectors > instead of flink-formats to be confusing too. > > I wonder why it's that way in the beginning? If there's no good reason to > remain that way, +1 to move flink-orc to flink-formats > > Bowen > > > On Mon, Oct 28, 2019 at 11:26 PM Zhenghua Gao wrote: > > > +1 to move flink-orc to flink-formats. > > Since we have put other file-based formats(flink-avro, flink-json, > > flink-parquet, flink-json, flink-csv, flink-sequence-file) > > in flink-formats, flink-orc should be the same. > > > > *Best Regards,* > > *Zhenghua Gao* > > > > > > On Tue, Oct 29, 2019 at 11:10 AM Jingsong Lee > > wrote: > > > > > Hi devs, > > > > > > I found that Orc is still in connectors, but we already have the parent > > > model of formats. > > > - Is it better to put flink-orc in flink-formats? > > > - Is there a compatibility issue with the move? Looks like only the > > parent > > > model needs to be changed, and no other changes are needed. > > > > > > -- > > > Best, Jingsong Lee > > > > > >
Re: [DISCUSS] How to prepare the Python environment of PyFlink release in the current Flink release process
Hi Dian, Thanks a lot for bringing the discussion. It would be a headache to address these environmental problems, so +1 for option #1 to use the virtual environment. Best, Hequn On Tue, Oct 29, 2019 at 9:31 PM Till Rohrmann wrote: > Thanks for bringing this topic up Dian. I'd be in favour of option #1 > because this would also allow to create reproducible builds. > > Cheers, > Till > > On Tue, Oct 29, 2019 at 5:28 AM jincheng sun > wrote: > > > Hi, > > Thanks for bringing up the discussion Dian. > > +1 for the #1. > > > > Hi Jeff, this changes is for the PyFlink release, i.e.,The release > manager > > should build the release package for Pyflink, and prepare the python > > environment during the building. Since 1.10 we only support python 3.5+, > so > > it will throw an exception if you use python 3.4. > > > > Best, > > Jincheng > > > > > > Jeff Zhang 于2019年10月29日周二 上午11:55写道: > > > > > I am a little confused, why we need to prepare python environment in > > > release. Shouldn't that be done when user start to use pyflink ? > > > Or do you mean to set up python environment for pyflink's CI build ? > > > > > > Regarding this problem "It needs a proper Python environment(i.e. > Python > > > 3.5+, setuptools, etc) to build the PyFlink package" > > > Would the build fail if I use python 3.4 ? > > > > > > > > > Dian Fu 于2019年10月29日周二 上午11:01写道: > > > > > > > Hi all, > > > > > > > > We have reached a consensus that the PyFlink package should be > > published > > > > to PyPI in [1]. Thanks to Jincheng's effort, the PyPI account has > > already > > > > been created and available to use now [2]. It means that we could > > publish > > > > PyFlink to PyPI in the coming releases and it also means that > > additional > > > > steps will be added to the normal process of the Flink release to > > prepare > > > > the PyFlink release package. > > > > > > > > It needs a proper Python environment(i.e. Python 3.5+, setuptools, > etc) > > > to > > > > build the PyFlink package. 
There are two options in my mind to > prepare > > > the > > > > Python environment: > > > > 1) Reuse the script lint-python.sh defined in flink-python module to > > > > create the required virtual environment and build the PyFlink package > > > using > > > > the created virtual environment. > > > > 2) It's assumed that the local Python environment is properly > installed > > > > and ready to use. The Python environment requirement will be > documented > > > at > > > > the page "Create a Flink Release" and a validation check could also be > > > added > > > > in create_binary_release.sh to throw a meaningful error with hints > how > > > to > > > > fix it if it's not correct. > > > > > > > > Option 1: > > > > Pros: > > > > - It's transparent for release managers. > > > > Cons: > > > > - It needs to prepare the virtual environment during preparing the > > > PyFlink > > > > release package and it will take several minutes as it needs to > > > > download a few binaries. > > > > > > > > Option 2: > > > > Pros: > > > > - There is no need to prepare the virtual environment if the local > > > > environment is already properly configured. > > > > Cons: > > > > - It requires the release managers to prepare the local Python > > > environment > > > > and not all the people are familiar with Python and it's a burden for > > > > release managers. > > > > > > > > Personally I prefer option 1). > > > > > > > > Looking forward to your feedback! > > > > > > > > PS: I think this issue could also be discussed in the JIRA. But I > tend > > to > > > > bring up the discussion to ML as it introduces an additional step to > > the > > > > release process and I think this should be visible to the community > and > > > it > > > > should be well discussed. Besides, we could also get more feedback. 
> > > > > > > > Regards, > > > > Dian > > > > > > > > [1] > > > > > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Publish-the-PyFlink-into-PyPI-tt31201.html > > > > [2] > > > > > > > > > > https://issues.apache.org/jira/browse/FLINK-13011?focusedCommentId=16947307=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16947307 > > > > > > > > > > > > -- > > > Best Regards > > > > > > Jeff Zhang > > > > > >
[jira] [Created] (FLINK-14564) all-local slave config doesn't respect $FLINK_SSH_OPTS
Robert Lugg created FLINK-14564: --- Summary: all-local slave config doesn't respect $FLINK_SSH_OPTS Key: FLINK-14564 URL: https://issues.apache.org/jira/browse/FLINK-14564 Project: Flink Issue Type: Bug Components: Runtime / Configuration Affects Versions: 1.9.0 Reporter: Robert Lugg This report is based on a code review. The scenario described is unlikely to happen in the wild. Examining lines 669-686 of ./bin/config.sh reveals what I believe to be a design bug. Presumably to speed up localhost launching, if ALL slaves are localhost then ssh isn't called; instead, taskmanager.sh is called directly. This seems like a bad idea: * That instance will inherit environment variables from the current shell. * If a user specifies $FLINK_SSH_OPTS, it will not be honored in the "all local" case. My request is that, regardless of mode, taskmanager.sh is launched with the exact same environment. If anyone also happens to be digging through that code, 'readSlaves' could be improved: in addition to checking for 'localhost' and '127.0.0.1', it could also check for `hostname`. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14563) reuse HiveShim instance in Hive functions code path rather than creating new ones repeatedly
Bowen Li created FLINK-14563: Summary: reuse HiveShim instance in Hive functions code path rather than creating new ones repeatedly Key: FLINK-14563 URL: https://issues.apache.org/jira/browse/FLINK-14563 Project: Flink Issue Type: Improvement Components: Connectors / Hive Reporter: Bowen Li Assignee: Bowen Li Fix For: 1.10.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [Discussion] FLIP-79 Flink Function DDL Support
Hi all, Besides all the good questions raised above, we all seem to agree to have an MVP for Flink 1.10, "to support users to create and persist a java class-based udf that's already in classpath (no extra resource loading), and use it later in queries". IIUIC, to achieve that in 1.10, the following are currently the core issues/blockers we should figure out, and solve them as our **highest priority**: - what's the syntax to distinguish function language (java, scala, python, etc)? we only need to implement the java one in 1.10 but have to settle down the long-term solution - how to persist function language in the backend catalog? as a k-v pair in the properties map, or a dedicated field? - do we really need to allow users to set a properties map for a udf? what needs to be stored there? what are they used for? - should a catalog impl be able to decide whether it can take a properties map (if we decide to have one), and which language of a udf it can persist? - E.g. Hive metastore, which backs Flink's HiveCatalog, cannot take a properties map and is only able to persist java udfs [1], unless we do something hacky to it I feel these questions are essential to Flink functions in the long run, but most importantly, are also the minimum scope for Flink 1.10. Aspects like resource loading security or compatibility with Hive syntax are important too; however, if we focus on them now, we may not be able to get the MVP out in time. [1] - https://hive.apache.org/javadocs/r3.1.2/api/org/apache/hadoop/hive/metastore/api/Function.html - https://hive.apache.org/javadocs/r3.1.2/api/org/apache/hadoop/hive/metastore/api/FunctionType.html On Sun, Oct 27, 2019 at 8:22 PM Peter Huang wrote: > Hi Timo, > > Thanks for the feedback. I replied and adjust the design accordingly. For > the concern of class loading. > I think we need to distinguish the function class loading for Temporary and > Permanent function. 
> > 1) For Permanent function, we can add it to the job graph so that we don't > need to load it multiple times for the different sessions. > 2) For Temporary function, we can register function with a session key, and > use different class loaders in RuntimeContext implementation. > > I added more description in the doc. Please review it again. > > > Best Regards > Peter Huang > > > > > On Thu, Oct 24, 2019 at 2:14 AM Timo Walther wrote: > > > Hi Peter, > > > > thanks for your proposal. I left some comments in the FLIP document. I > > agree with Terry that we can have a MVP in Flink 1.10 but should already > > discuss the bigger picture as a DDL string cannot be changed easily once > > released. > > > > In particular we should discuss how resources for function are loaded. > > If they are simply added to the JobGraph they are available to all > > functions and could potentially interfere with each other, right? > > > > Thanks, > > Timo > > > > > > > > On 24.10.19 05:32, Terry Wang wrote: > > > Hi Peter, > > > > > > Sorry late to reply. Thanks for your efforts on this and I just looked > > through your design. > > > I left some comments in the doc about alter function section and > > function catalog interface. > > > IMO, the overall design is ok and we can discuss further more about > some > > details. > > > I also think it’s necessary to have this awesome feature limit to basic > > function (of course better to have all :) ) in 1.10 release. > > > > > > Best, > > > Terry Wang > > > > > > > > > > > >> 2019年10月16日 14:19,Peter Huang 写道: > > >> > > >> Hi Xuefu, > > >> > > >> Thank you for the feedback. I think you are pointing out a similar > > concern > > >> with Bowen. Let me describe > > >> how the catalog function and function factory will be changed in the > > >> implementation section. > > >> Then, we can have more discussion in detail. 
> > >> > > >> > > >> Best Regards > > >> Peter Huang > > >> > > >> On Tue, Oct 15, 2019 at 4:18 PM Xuefu Z wrote: > > >> > > >>> Thanks to Peter for the proposal! > > >>> > > >>> I left some comments in the google doc. Besides what Bowen pointed > > out, I'm > > >>> unclear about how things work end to end from the document. For > > instance, > > >>> SQL DDL-like function definition is mentioned. I guess just having a > > DDL > > >>> for it doesn't explain how it's supported functionally. I think it's > > better > > >>> to have some clarification on what is expected work and what's for > the > > >>> future. > > >>> > > >>> Thanks, > > >>> Xuefu > > >>> > > >>> > > >>> On Tue, Oct 15, 2019 at 11:05 AM Bowen Li > wrote: > > >>> > > Hi Zhenqiu, > > > > Thanks for taking on this effort! > > > > A couple questions: > > - Though this FLIP is about function DDL, can we also think about > how > > the > > created functions can be mapped to CatalogFunction and see if we > need > > to > > modify CatalogFunction interface? Syntax changes need to be backed > by > > the > > backend. > > - Can we define a clearer, smaller scope targeting for Flink 1.10
Re: [DISCUSS] Move flink-orc to flink-formats from flink-connectors
Hi Jingsong, Thanks for bringing it up. I also find it confusing that flink-orc lives under flink-connectors instead of flink-formats. I wonder why it was that way in the beginning? If there's no good reason to remain that way, +1 to move flink-orc to flink-formats Bowen On Mon, Oct 28, 2019 at 11:26 PM Zhenghua Gao wrote: > +1 to move flink-orc to flink-formats. > Since we have put other file-based formats(flink-avro, flink-json, > flink-parquet, flink-json, flink-csv, flink-sequence-file) > in flink-formats, flink-orc should be the same. > > *Best Regards,* > *Zhenghua Gao* > > > On Tue, Oct 29, 2019 at 11:10 AM Jingsong Lee > wrote: > > > Hi devs, > > > > I found that Orc is still in connectors, but we already have the parent > > model of formats. > > - Is it better to put flink-orc in flink-formats? > > - Is there a compatibility issue with the move? Looks like only the > parent > > model needs to be changed, and no other changes are needed. > > > > -- > > Best, Jingsong Lee > > >
Re: [VOTE] Accept Stateful Functions into Apache Flink
+1 (non-binding) Seth > On Oct 23, 2019, at 9:31 PM, Jingsong Li wrote: > > +1 (non-binding) > > Best, > Jingsong Lee > >> On Wed, Oct 23, 2019 at 9:02 PM Yu Li wrote: >> >> +1 (non-binding) >> >> Best Regards, >> Yu >> >> >>> On Wed, 23 Oct 2019 at 16:56, Haibo Sun wrote: >>> >>> +1 (non-binding)Best, >>> Haibo >>> >>> >>> At 2019-10-23 09:07:41, "Becket Qin" wrote: +1 (binding) Thanks, Jiangjie (Becket) Qin On Tue, Oct 22, 2019 at 11:44 PM Tzu-Li (Gordon) Tai < >> tzuli...@apache.org wrote: > +1 (binding) > > Gordon > > On Tue, Oct 22, 2019, 10:58 PM Zhijiang .invalid> > wrote: > >> +1 (non-binding) >> >> Best, >> Zhijiang >> >> >> -- >> From:Zhu Zhu >> Send Time:2019 Oct. 22 (Tue.) 16:33 >> To:dev >> Subject:Re: [VOTE] Accept Stateful Functions into Apache Flink >> >> +1 (non-binding) >> >> Thanks, >> Zhu Zhu >> >> Biao Liu 于2019年10月22日周二 上午11:06写道: >> >>> +1 (non-binding) >>> >>> Thanks, >>> Biao /'bɪ.aʊ/ >>> >>> >>> On Tue, 22 Oct 2019 at 10:26, Jark Wu wrote: +1 (non-binding) Best, Jark On Tue, 22 Oct 2019 at 09:38, Hequn Cheng >> >> wrote: > +1 (non-binding) > > Best, Hequn > > On Tue, Oct 22, 2019 at 9:21 AM Dian Fu < >> dian0511...@gmail.com> >> wrote: > >> +1 (non-binding) >> >> Regards, >> Dian >> >>> 在 2019年10月22日,上午9:10,Kurt Young 写道: >>> >>> +1 (binding) >>> >>> Best, >>> Kurt >>> >>> >>> On Tue, Oct 22, 2019 at 12:56 AM Fabian Hueske < >> fhue...@gmail.com> >> wrote: >>> +1 (binding) Am Mo., 21. Okt. 
2019 um 16:18 Uhr schrieb Thomas Weise < > t...@apache.org >>> : > +1 (binding) > > > On Mon, Oct 21, 2019 at 7:10 AM Timo Walther < >> twal...@apache.org >> wrote: > >> +1 (binding) >> >> Thanks, >> Timo >> >> >>> On 21.10.19 15:59, Till Rohrmann wrote: >>> +1 (binding) >>> >>> Cheers, >>> Till >>> >>> On Mon, Oct 21, 2019 at 12:13 PM Robert Metzger < > rmetz...@apache.org > >> wrote: >>> +1 (binding) On Mon, Oct 21, 2019 at 12:06 PM Stephan Ewen < >>> se...@apache.org > > wrote: > This is the official vote whether to accept the >>> Stateful > Functions > code > contribution to Apache Flink. > > The current Stateful Functions code, documentation, >>> and >>> website > can > be > found here: > https://statefun.io/ > https://github.com/ververica/stateful-functions > > This vote should capture whether the Apache Flink > community >>> is >> interested > in accepting, maintaining, and evolving Stateful > Functions. > > Reiterating my original motivation, I believe that >>> this >>> project > is a great > match for Apache Flink, because it helps Flink to >> grow > the community into a > new set of use cases. We see current users >> interested >>> in >> such use >> cases, > but they are not well supported by Flink as it >>> currently >> is. > > I also personally commit to put time into making >> sure > this integrates well > with Flink and that we grow contributors and >>> committers > to > maintain >> this > new component well. > > This is a "Adoption of a new Codebase" vote as per >> the >> Flink > bylaws >> [1]. > Only PMC votes are binding. The vote will be open at > least >> 6 days > (excluding weekends), meaning until Tuesday Oct.29th > 12:00 >>> UTC, > or >> until we > achieve the 2/3rd majority. > > Happy voting! > > Best,
Re: [ANNOUNCE] Becket Qin joins the Flink PMC
Congratulations Becket! :-) Xuannan > On Oct 28, 2019, at 6:07 AM, Fabian Hueske wrote: > > Hi everyone, > > I'm happy to announce that Becket Qin has joined the Flink PMC. > Let's congratulate and welcome Becket as a new member of the Flink PMC! > > Cheers, > Fabian
Re: [ANNOUNCE] Becket Qin joins the Flink PMC
Congrats Becket! On Tue, Oct 29, 2019 at 06:32 Till Rohrmann wrote: > Congrats Becket :-) > > On Tue, Oct 29, 2019 at 10:27 AM Yang Wang wrote: > > > Congratulations Becket :) > > > > Best, > > Yang > > > > Vijay Bhaskar 于2019年10月29日周二 下午4:31写道: > > > > > Congratulations Becket > > > > > > Regards > > > Bhaskar > > > > > > On Tue, Oct 29, 2019 at 1:53 PM Danny Chan > wrote: > > > > > > > Congratulations :) > > > > > > > > Best, > > > > Danny Chan > > > > 在 2019年10月29日 +0800 PM4:14,dev@flink.apache.org,写道: > > > > > > > > > > Congratulations :) > > > > > > > > > >
Re: [VOTE] Accept Stateful Functions into Apache Flink
+1 (binding) On Tue, Oct 29, 2019 at 10:09 AM Igal Shilman wrote: > +1 (non-binding) > > Thanks, > Igal Shilman > > On Sat, Oct 26, 2019 at 12:25 AM Ufuk Celebi wrote: > > > +1 (binding) > > > > On Fri, Oct 25, 2019 at 6:39 PM Maximilian Michels > wrote: > > > > > +1 (binding) > > > > > > On 25.10.19 14:31, Congxian Qiu wrote: > > > > +1 (non-biding) > > > > Best, > > > > Congxian > > > > > > > > > > > > Terry Wang 于2019年10月24日周四 上午11:15写道: > > > > > > > >> +1 (non-biding) > > > >> > > > >> Best, > > > >> Terry Wang > > > >> > > > >> > > > >> > > > >>> 2019年10月24日 10:31,Jingsong Li 写道: > > > >>> > > > >>> +1 (non-binding) > > > >>> > > > >>> Best, > > > >>> Jingsong Lee > > > >>> > > > >>> On Wed, Oct 23, 2019 at 9:02 PM Yu Li wrote: > > > >>> > > > +1 (non-binding) > > > > > > Best Regards, > > > Yu > > > > > > > > > On Wed, 23 Oct 2019 at 16:56, Haibo Sun > wrote: > > > > > > > +1 (non-binding)Best, > > > > Haibo > > > > > > > > > > > > At 2019-10-23 09:07:41, "Becket Qin" > wrote: > > > >> +1 (binding) > > > >> > > > >> Thanks, > > > >> > > > >> Jiangjie (Becket) Qin > > > >> > > > >> On Tue, Oct 22, 2019 at 11:44 PM Tzu-Li (Gordon) Tai < > > > tzuli...@apache.org > > > >> > > > >> wrote: > > > >> > > > >>> +1 (binding) > > > >>> > > > >>> Gordon > > > >>> > > > >>> On Tue, Oct 22, 2019, 10:58 PM Zhijiang < > > > wangzhijiang...@aliyun.com > > > >>> .invalid> > > > >>> wrote: > > > >>> > > > +1 (non-binding) > > > > > > Best, > > > Zhijiang > > > > > > > > > > > -- > > > From:Zhu Zhu > > > Send Time:2019 Oct. 22 (Tue.) 
16:33 > > > To:dev > > > Subject:Re: [VOTE] Accept Stateful Functions into Apache Flink > > > > > > +1 (non-binding) > > > > > > Thanks, > > > Zhu Zhu > > > > > > Biao Liu 于2019年10月22日周二 上午11:06写道: > > > > > > > +1 (non-binding) > > > > > > > > Thanks, > > > > Biao /'bɪ.aʊ/ > > > > > > > > > > > > > > > > On Tue, 22 Oct 2019 at 10:26, Jark Wu > > wrote: > > > > > > > >> +1 (non-binding) > > > >> > > > >> Best, > > > >> Jark > > > >> > > > >> On Tue, 22 Oct 2019 at 09:38, Hequn Cheng < > > chenghe...@gmail.com > > > > > > > wrote: > > > >> > > > >>> +1 (non-binding) > > > >>> > > > >>> Best, Hequn > > > >>> > > > >>> On Tue, Oct 22, 2019 at 9:21 AM Dian Fu < > > > dian0511...@gmail.com> > > > wrote: > > > >>> > > > +1 (non-binding) > > > > > > Regards, > > > Dian > > > > > > > 在 2019年10月22日,上午9:10,Kurt Young 写道: > > > > > > > > +1 (binding) > > > > > > > > Best, > > > > Kurt > > > > > > > > > > > > On Tue, Oct 22, 2019 at 12:56 AM Fabian Hueske < > > > fhue...@gmail.com> > > > wrote: > > > > > > > >> +1 (binding) > > > >> > > > >> Am Mo., 21. Okt. 2019 um 16:18 Uhr schrieb Thomas Weise > < > > > >>> t...@apache.org > > > > : > > > >> > > > >>> +1 (binding) > > > >>> > > > >>> > > > >>> On Mon, Oct 21, 2019 at 7:10 AM Timo Walther < > > > twal...@apache.org > > > >> > > > wrote: > > > >>> > > > +1 (binding) > > > > > > Thanks, > > > Timo > > > > > > > > > On 21.10.19 15:59, Till Rohrmann wrote: > > > > +1 (binding) > > > > > > > > Cheers, > > > > Till > > > > > > > > On Mon, Oct 21, 2019 at 12:13 PM Robert Metzger < > > > >>> rmetz...@apache.org > > > >>> > > > wrote: > > > > > > > >> +1 (binding) > > > >> > > > >> On Mon, Oct 21, 2019 at 12:06 PM Stephan Ewen < > > > > se...@apache.org > > > >>> > > > >>> wrote: > > > >> > > > >>> This is the official vote whether to accept the > > > > Stateful > > > >>> Functions > > > >>> code > > > >>> contribution to Apache Flink. 
> > > >>> > > > >>> The current Stateful Functions code, documentation, > > > > and > > > > website > > > >>> can > > >
[jira] [Created] (FLINK-14562) RMQSource leaves idle consumer after closing
Nicolas Deslandes created FLINK-14562: - Summary: RMQSource leaves idle consumer after closing Key: FLINK-14562 URL: https://issues.apache.org/jira/browse/FLINK-14562 Project: Flink Issue Type: Bug Components: Connectors / RabbitMQ Affects Versions: 1.8.2, 1.8.0, 1.7.2 Reporter: Nicolas Deslandes The RabbitMQ connector does not close its consumers and channel on closing. This potentially leaves an idle consumer on the queue that prevents any other consumer on the same queue from getting messages; it happens most often when a job is stopped/cancelled and redeployed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
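The fix described above amounts to closing the consumer, channel, and connection when the source shuts down, in reverse order of creation, without letting one failing close() skip the others. A minimal sketch of that pattern, with plain AutoCloseable standing in for the RabbitMQ client types (illustrative only, not Flink's actual RMQSource code):

```java
// Sketch of close-in-reverse-order with suppressed-exception handling.
// AutoCloseable stands in for the RabbitMQ Connection/Channel/Consumer.
import java.util.ArrayDeque;
import java.util.Deque;

public class ResourceChain implements AutoCloseable {
    private final Deque<AutoCloseable> resources = new ArrayDeque<>();

    // Register resources in the order they are opened
    // (e.g. connection, then channel, then consumer).
    void register(AutoCloseable resource) {
        resources.push(resource);
    }

    // Close everything in reverse order; if a close fails, keep closing
    // the rest and rethrow the first failure with the others suppressed.
    @Override
    public void close() throws Exception {
        Exception first = null;
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

With this shape, a source's close() only needs to call the chain once, and the idle consumer cannot survive a stop/cancel.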
Re: compile failed
> > Sorry for my poor Mac skills. I found I have many Java versions installed, and I > had written JAVA_HOME with an absolute path, so even after updating to the latest JDK > version the build still failed. > >
[jira] [Created] (FLINK-14561) Don't write FLINK_PLUGINS_DIR ENV variable to Flink configuration
Till Rohrmann created FLINK-14561: - Summary: Don't write FLINK_PLUGINS_DIR ENV variable to Flink configuration Key: FLINK-14561 URL: https://issues.apache.org/jira/browse/FLINK-14561 Project: Flink Issue Type: Bug Components: Runtime / Coordination Affects Versions: 1.9.1, 1.10.0 Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 1.10.0, 1.9.2 With FLINK-12143 we introduced the plugin mechanism. As part of this feature, we now write the {{FLINK_PLUGINS_DIR}} environment variable to the Flink {{Configuration}} we use for the cluster components. This is problematic because we also use this {{Configuration}} to start new processes (Yarn and Mesos {{TaskExecutors}}). If the {{Configuration}} contains a configured {{FLINK_PLUGINS_DIR}} which differs from the one used by the newly created process, then this leads to problems. In order to solve this problem, I suggest not writing env variables that are intended for local usage within the started process into the {{Configuration}}. Instead, we should read the environment variable directly at the required site, similar to what we do with the env variable {{FLINK_LIB_DIR}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
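The suggested approach (each process resolving the directory from its own environment at the point of use, instead of persisting the value into the Configuration that is shipped to new processes) can be sketched as below. The method name and default path are illustrative, not Flink's actual API:

```java
// Sketch: read FLINK_PLUGINS_DIR from the process environment when needed,
// with a fallback default, rather than writing it into the Configuration.
import java.util.Map;

public class EnvVarLookup {
    static final String PLUGINS_DIR_ENV = "FLINK_PLUGINS_DIR";
    static final String DEFAULT_PLUGINS_DIR = "plugins"; // illustrative default

    // Resolve the plugins directory from the given environment map,
    // falling back to the default when the variable is unset or empty.
    static String getPluginsDir(Map<String, String> env) {
        String configured = env.get(PLUGINS_DIR_ENV);
        return (configured == null || configured.isEmpty())
                ? DEFAULT_PLUGINS_DIR : configured;
    }

    public static void main(String[] args) {
        // Each started process resolves the directory from its own environment,
        // so a stale value can never be inherited via the Configuration.
        System.out.println(getPluginsDir(System.getenv()));
    }
}
```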
Re: Build error: package does not exist
And I've tried just `mvn clean install -DskipTests -Drat.skip=true -DskipITs` as well. It takes around half an hour, so I'm not too keen to try all the possibilities. I guess there might be some other SDKs/libraries that I'm missing and Maven won't tell me? Or just some random incompatibility? Thanks for any tips, Hynek On Mon, 28 Oct 2019 at 20:24, Hynek Noll wrote: > Dear Pritam, > I've tried that as well; specifically I ran: > `mvn clean install -DskipTests -Drat.skip=true -DskipITs -Pinclude-kinesis > -Daws.kinesis-kpl.version=0.12.6` > But the result is still the same. During the build, the packages that I > suppose should be generated by Maven based on Amazon Kinesis are missing. > > Best regards, > Hynek > > On Mon, 28 Oct 2019 at 19:57, Pritam Sadhukhan < > sadhukhan.pri...@gmail.com> wrote: > >> Hi Hynek, >> >> please run mvn clean install -DskipTests -Drat.skip=true. >> >> It should build properly but takes time. >> >> Regards >> >> On Mon, Oct 28, 2019, 10:06 PM Hynek Noll wrote: >> >> > Hi Bruce and Jark, >> > Thank you for the tip, but I already did something similar by clicking >> "Generate >> > Sources and Update Folders". I tried the suggested command(s), but >> without >> > success unfortunately. >> > Executing `mvn clean install -DskipTests` resulted in an error: "Too >> many >> > files with unapproved license: 2 See RAT report ...". (In the report it >> > states the two files are: >> > flink-core/src/test/resources/abstractID-with-toString-field >> > flink-core/src/test/resources/abstractID-with-toString-field-set >> > ) While `mvn clean package -DskipTests` actually runs for 30+ minutes >> (much >> > longer than the first command, but maybe the first stops early because of the >> > error) and finishes fine, but I have the same problems afterwards. >> > >> > I've tried switching to Maven 3.1.1 (from 3.6.1). >> > >> > Now the one thing that resolved the above-stated missing package was >> > switching to the Scala 2.11.12 SDK (instead of 2.12)!
>> > >> > The steps I've been taking were (within IntelliJ) Invalidate caches & >> > restart (alternatively Exit and delete the .idea folder) -> Open >> IntelliJ >> > again -> Maven Reimport -> Maven Generate Sources and Update Folders -> >> > Build Project. That results in further package(s) missing: >> > >> > >> > *Error:(21, 53) java: package >> org.apache.flink.kinesis.shaded.com.amazonaws >> > does not exist* >> > Maybe it now has to do with just the dependency shading? >> > >> > Best regards, >> > Hynek >> > >> > On Sun, 27 Oct 2019 at 15:02, Jark Wu wrote: >> > >> > > Hi Hynek, >> > > >> > > Bruce is right, you should build the Flink source code first, before >> > developing, >> > > via `mvn clean package -DskipTests` in the root directory of Flink. >> > > This may take 10 minutes or more depending on your machine. >> > > >> > > Best, >> > > Jark >> > > >> > > On Sun, 27 Oct 2019 at 20:46, yanjun qiu >> wrote: >> > > >> > > > Hi Hynek, >> > > > I think you should run a maven build first: execute mvn clean install >> > > > -DskipTests. This is because the Flink SQL parser uses the Apache Calcite >> > > framework >> > > > to generate the SQL parser source code. >> > > > >> > > > Regards, >> > > > Bruce >> > > > >> > > > > On 27 Oct 2019 at 00:09, Hynek Noll wrote: >> > > > > >> > > > > package seems to be missing on GitHub: >> > > > >> > > > >> > > >> > >> >
Re: [ANNOUNCE] Becket Qin joins the Flink PMC
Congrats Becket :-) On Tue, Oct 29, 2019 at 10:27 AM Yang Wang wrote: > Congratulations Becket :) > > Best, > Yang > > Vijay Bhaskar 于2019年10月29日周二 下午4:31写道: > > > Congratulations Becket > > > > Regards > > Bhaskar > > > > On Tue, Oct 29, 2019 at 1:53 PM Danny Chan wrote: > > > > > Congratulations :) > > > > > > Best, > > > Danny Chan > > > 在 2019年10月29日 +0800 PM4:14,dev@flink.apache.org,写道: > > > > > > > > Congratulations :) > > > > > >
Re: How to specify a test to run in Flink?
Add `-DfailIfNoTests=false` to your maven command and then it should work. Cheers, Till On Tue, Oct 29, 2019 at 6:51 AM Jark Wu wrote: > Usually, I use the following commands to execute a single test, and it works > well. > > ``` > $ mvn clean install -DskipTests > # go to the module where the test is located > $ cd flink-connectors/flink-hbase > $ mvn test -Dtest=org.apache.flink.addons.hbase.HBaseConnectorITCase > -Dcheckstyle.skip=true > ``` > > Hope it helps! > > Best, > Jark > > > On Tue, 29 Oct 2019 at 11:12, Zhenghua Gao wrote: > > > Actually it's not a Flink problem. > > For a single-module project, you can run "mvn -Dtest=YOUR_TEST test" to > run a > > single test. > > > > For a multi-module project, you can use "-pl sub-module" to specify the > > module to which your test belongs ( mvn -Dtest=YOUR_TEST -pl YOUR_MODULE > > test), > > OR just CD to your module directory and run "mvn -Dtest=YOUR_TEST test" > > > > *Best Regards,* > > *Zhenghua Gao* > > > > > > On Tue, Oct 29, 2019 at 10:19 AM 朱国梁 wrote: > > > > > > > > Hi community! I have a problem that I cannot solve by google. > > > > > > > > > I am trying to specify a test to run using maven. > > > > > > > > > mvn clean test -Dtest=DistributedCacheTest > > > > > > > > > The result says that: > > > [ERROR] Failed to execute goal > > > org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test > (default-test) > > > on project force-shading: No tests were executed! (Set > > > -DfailIfNoTests=false to ignore this error.) -> [Help 1] > > > > > > > > > > > > > > > > > > -- > > > > > > --- > > > Best > > > zgl > > > --- > > > > > > > > > > > > > > >
Re: [DISCUSS] How to prepare the Python environment of PyFlink release in the current Flink release process
Thanks for bringing this topic up, Dian. I'd be in favour of option #1 because this would also allow us to create reproducible builds. Cheers, Till On Tue, Oct 29, 2019 at 5:28 AM jincheng sun wrote: > Hi, > Thanks for bringing up the discussion, Dian. > +1 for #1. > > Hi Jeff, this change is for the PyFlink release, i.e., the release manager > should build the release package for PyFlink and prepare the python > environment during the build. Since 1.10 we only support python 3.5+, so > it will throw an exception if you use python 3.4. > > Best, > Jincheng > > > On Tue, Oct 29, 2019 at 11:55 AM, Jeff Zhang wrote: > > > I am a little confused: why do we need to prepare the python environment in the > > release? Shouldn't that be done when users start to use pyflink? > > Or do you mean to set up the python environment for pyflink's CI build? > > > > Regarding the statement "It needs a proper Python environment (i.e. Python > > 3.5+, setuptools, etc) to build the PyFlink package": > > would the build fail if I use python 3.4? > > > > > > On Tue, Oct 29, 2019 at 11:01 AM, Dian Fu wrote: > > > > > Hi all, > > > > > > We have reached a consensus in [1] that the PyFlink package should be > published > > > to PyPI. Thanks to Jincheng's effort, the PyPI account has > already > > > been created and is available to use now [2]. It means that we can > publish > > > PyFlink to PyPI in the coming releases, and it also means that > additional > > > steps will be added to the normal Flink release process to > prepare > > > the PyFlink release package. > > > > > > It needs a proper Python environment (i.e. Python 3.5+, setuptools, etc) > > to > > > build the PyFlink package. There are two options in my mind to prepare > > the > > > Python environment: > > > 1) Reuse the script lint-python.sh defined in the flink-python module to > > > create the required virtual environment and build the PyFlink package > > using > > > the created virtual environment.
> > > 2) It's assumed that the local Python environment is properly installed > > > and ready to use. The Python environment requirement will be documented > on > > > the page "Create a Flink Release", and a validation check could also be > added > > > to create_binary_release.sh to throw a meaningful error with hints on how > to > > > fix it if it's not correct. > > > > > > Option 1: > > > Pros: > > > - It's transparent for release managers. > > > Cons: > > > - It needs to prepare the virtual environment while preparing the > PyFlink > > > release package, and this will take several minutes as it needs to > > > download a few binaries. > > > > > > Option 2: > > > Pros: > > > - There is no need to prepare the virtual environment if the local > > > environment is already properly configured. > > > Cons: > > > - It requires the release managers to prepare the local Python > environment; > > > not all people are familiar with Python, so it's a burden for > > > release managers. > > > > > > Personally, I prefer option 1). > > > > > > Looking forward to your feedback! > > > > > > PS: I think this issue could also be discussed in the JIRA, but I tend > to > > > bring the discussion to the ML as it introduces an additional step to > the > > > release process, and I think this should be visible to the community and > > it > > > should be well discussed. Besides, we could also get more feedback. > > > > > > Regards, > > > Dian > > > > > > [1] > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Publish-the-PyFlink-into-PyPI-tt31201.html > > > [2] > > > > > > https://issues.apache.org/jira/browse/FLINK-13011?focusedCommentId=16947307=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16947307 > > > > > > > > -- > > Best Regards > > > > Jeff Zhang > > >
[jira] [Created] (FLINK-14560) The value of taskmanager.memory.size in flink-conf.yaml is set to zero will cause taskmanager not to work
fa zheng created FLINK-14560: Summary: The value of taskmanager.memory.size in flink-conf.yaml is set to zero will cause taskmanager not to work Key: FLINK-14560 URL: https://issues.apache.org/jira/browse/FLINK-14560 Project: Flink Issue Type: Bug Components: Deployment / YARN Affects Versions: 1.9.1, 1.9.0 Reporter: fa zheng Fix For: 1.10.0, 1.9.2 If you accidentally set taskmanager.memory.size: 0 in flink-conf.yaml, flink should take a fixed ratio with respect to the size of the task manager JVM. The related code is in TaskManagerServicesConfiguration.fromConfiguration:
{code:java}
// extract memory settings
long configuredMemory;
String managedMemorySizeDefaultVal = TaskManagerOptions.MANAGED_MEMORY_SIZE.defaultValue();
if (!configuration.getString(TaskManagerOptions.MANAGED_MEMORY_SIZE).equals(managedMemorySizeDefaultVal)) {
	try {
		configuredMemory = MemorySize.parse(configuration.getString(TaskManagerOptions.MANAGED_MEMORY_SIZE), MEGA_BYTES).getMebiBytes();
	} catch (IllegalArgumentException e) {
		throw new IllegalConfigurationException(
			"Could not read " + TaskManagerOptions.MANAGED_MEMORY_SIZE.key(), e);
	}
} else {
	configuredMemory = Long.valueOf(managedMemorySizeDefaultVal);
}
{code}
However, in FlinkYarnSessionCli.java, flink translates the value to bytes:
{code:java}
// JobManager Memory
final int jobManagerMemoryMB = ConfigurationUtils.getJobManagerHeapMemory(configuration).getMebiBytes();

// Task Managers memory
final int taskManagerMemoryMB = ConfigurationUtils.getTaskManagerHeapMemory(configuration).getMebiBytes();
{code}
As a result, 0 is translated to "0 b", which differs from the default value.
"0 b" then causes an error in the following check code:
{code:java}
checkConfigParameter(
	configuration.getString(TaskManagerOptions.MANAGED_MEMORY_SIZE).equals(TaskManagerOptions.MANAGED_MEMORY_SIZE.defaultValue()) ||
		configuredMemory > 0,
	configuredMemory,
	TaskManagerOptions.MANAGED_MEMORY_SIZE.key(),
	"MemoryManager needs at least one MB of memory. " +
		"If you leave this config parameter empty, the system automatically " +
		"pick a fraction of the available memory.");
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
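The mismatch comes from how the configured value string is parsed back. A minimal, self-contained sketch of that parsing (Flink's actual MemorySize.parse is more general; the units and rounding here are simplified for illustration):

```java
// Simplified stand-in for MemorySize.parse(...).getMebiBytes(): a bare
// number is read as mebibytes, while a "b" suffix is read as plain bytes
// and rounded down to whole MiB. A configured "0" therefore becomes
// 0 bytes, no longer string-equals the default value, and then fails the
// configuredMemory > 0 check shown above.
public class MemParse {
    static long parseToMebiBytes(String value) {
        String v = value.trim().toLowerCase();
        if (v.endsWith("m")) {
            // mebibytes with explicit unit, e.g. "1024m"
            return Long.parseLong(v.substring(0, v.length() - 1));
        }
        if (v.endsWith("b")) {
            // plain bytes, e.g. "2097152b", rounded down to whole MiB
            return Long.parseLong(v.substring(0, v.length() - 1)) >> 20;
        }
        // no unit: interpret the number as mebibytes
        return Long.parseLong(v);
    }
}
```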
Re: [VOTE] FLIP-70: Flink SQL Computed Column Design
+1 Best, Dawid On 29/10/2019 04:03, Zhenghua Gao wrote: > +1 (non-binding) > > *Best Regards,* > *Zhenghua Gao* > > > On Mon, Oct 28, 2019 at 2:26 PM Danny Chan wrote: > >> Hi all, >> >> I would like to start the vote for FLIP-70[1] which is discussed and >> reached consensus in the discussion thread[2]. >> >> The vote will be open for at least 72 hours. I'll try to close it by >> 2019-10-31 18:00 UTC, unless there is an objection or not enough votes. >> >> [1] >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-70%3A+Flink+SQL+Computed+Column+Design >> [2] >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-70-Support-Computed-Column-for-Flink-SQL-td33126.html >> >> Best, >> Danny Chan >> signature.asc Description: OpenPGP digital signature
[jira] [Created] (FLINK-14559) flink sql connector for stream table stdout
xiaodao created FLINK-14559: --- Summary: flink sql connector for stream table stdout Key: FLINK-14559 URL: https://issues.apache.org/jira/browse/FLINK-14559 Project: Flink Issue Type: New Feature Components: Connectors / Common Affects Versions: 1.9.1 Reporter: xiaodao In some cases, we need to output table stream results to stdout, just to test or verify SQL. The output message format looks like:
+----------+----------+----------+
| colName1 | colName2 | colName3 |
+----------+----------+----------+
| xiaoming | xm123    | 小明     |
+----------+----------+----------+
| xiaohong | xh123    | 小红     |
+----------+----------+----------+
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14558) Fix the ClassNotFoundException issue for run python job in standalone mode
sunjincheng created FLINK-14558: --- Summary: Fix the ClassNotFoundException issue for run python job in standalone mode Key: FLINK-14558 URL: https://issues.apache.org/jira/browse/FLINK-14558 Project: Flink Issue Type: Improvement Components: API / Python Reporter: sunjincheng Fix For: 1.10.0 java.lang.ClassNotFoundException: org.apache.flink.table.runtime.operators.python.PythonScalarFunctionOperator will be thrown when running a Python UDF job in a standalone cluster. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14557) Clean up the package of py4j
sunjincheng created FLINK-14557: --- Summary: Clean up the package of py4j Key: FLINK-14557 URL: https://issues.apache.org/jira/browse/FLINK-14557 Project: Flink Issue Type: Improvement Components: API / Python Reporter: sunjincheng Fix For: 1.10.0 Currently the Py4j package contains a __MACOSX directory. It is useless and should be removed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
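As a sketch of how such a cleanup could be automated when repackaging an archive, here is a small, self-contained helper that copies a zip while dropping macOS metadata entries (the __MACOSX/ directory and .DS_Store files). This is illustrative only and not part of Flink's actual packaging scripts:

```java
// Copy a zip stream to another, skipping macOS metadata entries.
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipClean {
    static boolean isMacMetadata(String name) {
        return name.startsWith("__MACOSX/") || name.equals("__MACOSX")
                || name.endsWith(".DS_Store");
    }

    // Copy every entry from 'in' to 'out', filtering out metadata entries.
    static void stripMacMetadata(InputStream in, OutputStream out) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(in);
             ZipOutputStream zout = new ZipOutputStream(out)) {
            byte[] buf = new byte[8192];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (isMacMetadata(entry.getName())) {
                    continue; // drop __MACOSX/** and .DS_Store
                }
                zout.putNextEntry(new ZipEntry(entry.getName()));
                int n;
                while ((n = zin.read(buf)) != -1) {
                    zout.write(buf, 0, n);
                }
                zout.closeEntry();
            }
        }
    }
}
```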
[jira] [Created] (FLINK-14556) Correct the package of cloud pickle
sunjincheng created FLINK-14556: --- Summary: Correct the package of cloud pickle Key: FLINK-14556 URL: https://issues.apache.org/jira/browse/FLINK-14556 Project: Flink Issue Type: Improvement Components: API / Python Reporter: sunjincheng Fix For: 1.10.0 Currently the package structure of cloud pickle is as follows:
{code:java}
cloudpickle-1.2.2/
cloudpickle-1.2.2/cloudpickle/
cloudpickle-1.2.2/cloudpickle/__init__.py
cloudpickle-1.2.2/cloudpickle/cloudpickle.py
cloudpickle-1.2.2/cloudpickle/cloudpickle_fast.py
cloudpickle-1.2.2/LICENSE
{code}
It should be:
{code:java}
cloudpickle/
cloudpickle/__init__.py
cloudpickle/cloudpickle.py
cloudpickle/cloudpickle_fast.py
cloudpickle/LICENSE
{code}
Otherwise, the following error will be thrown when running in a standalone cluster: "ImportError: No module named cloudpickle". -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14555) Streaming File Sink s3 end-to-end test stalls
Gary Yao created FLINK-14555: Summary: Streaming File Sink s3 end-to-end test stalls Key: FLINK-14555 URL: https://issues.apache.org/jira/browse/FLINK-14555 Project: Flink Issue Type: Bug Components: Connectors / FileSystem, Tests Affects Versions: 1.10.0 Reporter: Gary Yao [https://api.travis-ci.org/v3/job/603882577/log.txt] {noformat} == Running 'Streaming File Sink s3 end-to-end test' == TEST_DATA_DIR: /home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/temp-test-directory-36388677539 Flink dist directory: /home/travis/build/apache/flink/flink-dist/target/flink-1.10-SNAPSHOT-bin/flink-1.10-SNAPSHOT Found AWS bucket [secure], running the e2e test. Found AWS access key, running the e2e test. Found AWS secret key, running the e2e test. Executing test with dynamic openSSL linkage (random selection between 'dynamic' and 'static') Setting up SSL with: internal OPENSSL dynamic Using SAN dns:travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866,ip:10.20.1.215,ip:172.17.0.1 Certificate was added to keystore Certificate was added to keystore Certificate reply was installed in keystore MAC verified OK Setting up SSL with: rest OPENSSL dynamic Using SAN dns:travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866,ip:10.20.1.215,ip:172.17.0.1 Certificate was added to keystore Certificate was added to keystore Certificate reply was installed in keystore MAC verified OK Mutual ssl auth: true Use s3 output Starting cluster. Starting standalonesession daemon on host travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. Starting taskexecutor daemon on host travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. Waiting for dispatcher REST endpoint to come up... Waiting for dispatcher REST endpoint to come up... Waiting for dispatcher REST endpoint to come up... Waiting for dispatcher REST endpoint to come up... Waiting for dispatcher REST endpoint to come up... Waiting for dispatcher REST endpoint to come up... Waiting for dispatcher REST endpoint to come up... 
Waiting for dispatcher REST endpoint to come up... Waiting for dispatcher REST endpoint to come up... Waiting for dispatcher REST endpoint to come up... Waiting for dispatcher REST endpoint to come up... Waiting for dispatcher REST endpoint to come up... Dispatcher REST endpoint is up. [INFO] 1 instance(s) of taskexecutor are already running on travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. Starting taskexecutor daemon on host travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. [INFO] 2 instance(s) of taskexecutor are already running on travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. Starting taskexecutor daemon on host travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. [INFO] 3 instance(s) of taskexecutor are already running on travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. Starting taskexecutor daemon on host travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. Submitting job. Job (c3a9bb7d3f47d63ebccbec5acb1342cb) is running. Waiting for job (c3a9bb7d3f47d63ebccbec5acb1342cb) to have at least 3 completed checkpoints ... Killing TM TaskManager 9227 killed. Starting TM [INFO] 3 instance(s) of taskexecutor are already running on travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. Starting taskexecutor daemon on host travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. Waiting for restart to happen Killing 2 TMs TaskManager 8618 killed. TaskManager 9658 killed. Starting 2 TMs [INFO] 2 instance(s) of taskexecutor are already running on travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. Starting taskexecutor daemon on host travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. [INFO] 3 instance(s) of taskexecutor are already running on travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. Starting taskexecutor daemon on host travis-job-4624d10f-b0e3-45e1-8615-972ff1d44866. 
Waiting for restart to happen Waiting until all values have been produced Number of produced values 18080/6 No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself. Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received The build has been terminated {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [ANNOUNCE] Becket Qin joins the Flink PMC
Congratulations Becket :) Best, Yang Vijay Bhaskar 于2019年10月29日周二 下午4:31写道: > Congratulations Becket > > Regards > Bhaskar > > On Tue, Oct 29, 2019 at 1:53 PM Danny Chan wrote: > > > Congratulations :) > > > > Best, > > Danny Chan > > 在 2019年10月29日 +0800 PM4:14,dev@flink.apache.org,写道: > > > > > > Congratulations :) > > >
Re: [VOTE] Accept Stateful Functions into Apache Flink
+1 (non-binding) Thanks, Igal Shilman On Sat, Oct 26, 2019 at 12:25 AM Ufuk Celebi wrote: > +1 (binding) > > On Fri, Oct 25, 2019 at 6:39 PM Maximilian Michels wrote: > > > +1 (binding) > > > > On 25.10.19 14:31, Congxian Qiu wrote: > > > +1 (non-biding) > > > Best, > > > Congxian > > > > > > > > > Terry Wang 于2019年10月24日周四 上午11:15写道: > > > > > >> +1 (non-biding) > > >> > > >> Best, > > >> Terry Wang > > >> > > >> > > >> > > >>> 2019年10月24日 10:31,Jingsong Li 写道: > > >>> > > >>> +1 (non-binding) > > >>> > > >>> Best, > > >>> Jingsong Lee > > >>> > > >>> On Wed, Oct 23, 2019 at 9:02 PM Yu Li wrote: > > >>> > > +1 (non-binding) > > > > Best Regards, > > Yu > > > > > > On Wed, 23 Oct 2019 at 16:56, Haibo Sun wrote: > > > > > +1 (non-binding)Best, > > > Haibo > > > > > > > > > At 2019-10-23 09:07:41, "Becket Qin" wrote: > > >> +1 (binding) > > >> > > >> Thanks, > > >> > > >> Jiangjie (Becket) Qin > > >> > > >> On Tue, Oct 22, 2019 at 11:44 PM Tzu-Li (Gordon) Tai < > > tzuli...@apache.org > > >> > > >> wrote: > > >> > > >>> +1 (binding) > > >>> > > >>> Gordon > > >>> > > >>> On Tue, Oct 22, 2019, 10:58 PM Zhijiang < > > wangzhijiang...@aliyun.com > > >>> .invalid> > > >>> wrote: > > >>> > > +1 (non-binding) > > > > Best, > > Zhijiang > > > > > > > -- > > From:Zhu Zhu > > Send Time:2019 Oct. 22 (Tue.) 
16:33 > > To:dev > > Subject:Re: [VOTE] Accept Stateful Functions into Apache Flink > > > > +1 (non-binding) > > > > Thanks, > > Zhu Zhu > > > > Biao Liu 于2019年10月22日周二 上午11:06写道: > > > > > +1 (non-binding) > > > > > > Thanks, > > > Biao /'bɪ.aʊ/ > > > > > > > > > > > > On Tue, 22 Oct 2019 at 10:26, Jark Wu > wrote: > > > > > >> +1 (non-binding) > > >> > > >> Best, > > >> Jark > > >> > > >> On Tue, 22 Oct 2019 at 09:38, Hequn Cheng < > chenghe...@gmail.com > > > > > wrote: > > >> > > >>> +1 (non-binding) > > >>> > > >>> Best, Hequn > > >>> > > >>> On Tue, Oct 22, 2019 at 9:21 AM Dian Fu < > > dian0511...@gmail.com> > > wrote: > > >>> > > +1 (non-binding) > > > > Regards, > > Dian > > > > > 在 2019年10月22日,上午9:10,Kurt Young 写道: > > > > > > +1 (binding) > > > > > > Best, > > > Kurt > > > > > > > > > On Tue, Oct 22, 2019 at 12:56 AM Fabian Hueske < > > fhue...@gmail.com> > > wrote: > > > > > >> +1 (binding) > > >> > > >> Am Mo., 21. Okt. 2019 um 16:18 Uhr schrieb Thomas Weise < > > >>> t...@apache.org > > > : > > >> > > >>> +1 (binding) > > >>> > > >>> > > >>> On Mon, Oct 21, 2019 at 7:10 AM Timo Walther < > > twal...@apache.org > > >> > > wrote: > > >>> > > +1 (binding) > > > > Thanks, > > Timo > > > > > > On 21.10.19 15:59, Till Rohrmann wrote: > > > +1 (binding) > > > > > > Cheers, > > > Till > > > > > > On Mon, Oct 21, 2019 at 12:13 PM Robert Metzger < > > >>> rmetz...@apache.org > > >>> > > wrote: > > > > > >> +1 (binding) > > >> > > >> On Mon, Oct 21, 2019 at 12:06 PM Stephan Ewen < > > > se...@apache.org > > >>> > > >>> wrote: > > >> > > >>> This is the official vote whether to accept the > > > Stateful > > >>> Functions > > >>> code > > >>> contribution to Apache Flink. 
> > >>> > > >>> The current Stateful Functions code, documentation, > > > and > > > website > > >>> can > > >>> be > > >>> found here: > > >>> https://statefun.io/ > > >>> https://github.com/ververica/stateful-functions > > >>> > > >>> This vote should capture whether the Apache Flink > > >>> community > > > is > > interested > > >>> in accepting, maintaining, and evolving Stateful > > >>> Functions. > >
Re: How to use two continuously window with EventTime in sql
Hi, You can use TUMBLE_ROWTIME(...) to get the rowtime attribute of the first window's result, and use this field to apply a following window aggregate. See https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql.html#group-windows for more details. Best, Jark On Tue, 29 Oct 2019 at 15:39, 刘建刚 wrote: > For one SQL window, I can register a table with event time and use the > time field in the tumble window. But if I want to take the result of the > first window and process it with another window, how can I do it? Thank > you. >
Re: [ANNOUNCE] Becket Qin joins the Flink PMC
Congratulations Becket Regards Bhaskar On Tue, Oct 29, 2019 at 1:53 PM Danny Chan wrote: > Congratulations :) > > Best, > Danny Chan > 在 2019年10月29日 +0800 PM4:14,dev@flink.apache.org,写道: > > > > Congratulations :) >
[jira] [Created] (FLINK-14554) Correct the comment of ExistingSavepoint#readKeyedState to generate java doc
vinoyang created FLINK-14554: Summary: Correct the comment of ExistingSavepoint#readKeyedState to generate java doc Key: FLINK-14554 URL: https://issues.apache.org/jira/browse/FLINK-14554 Project: Flink Issue Type: Improvement Components: API / State Processor, Documentation Reporter: vinoyang Currently, the comment of ExistingSavepoint#readKeyedState is malformed, so correct Javadoc cannot be generated from it. See here: https://ci.apache.org/projects/flink/flink-docs-release-1.9/api/java/org/apache/flink/state/api/ExistingSavepoint.html#readKeyedState-java.lang.String-org.apache.flink.state.api.functions.KeyedStateReaderFunction- -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [ANNOUNCE] Becket Qin joins the Flink PMC
Congratulations :) Best, Danny Chan 在 2019年10月29日 +0800 PM4:14,dev@flink.apache.org,写道: > > Congratulations :)
[jira] [Created] (FLINK-14553) Respect non-blocking output in StreamTask#processInput
zhijiang created FLINK-14553: Summary: Respect non-blocking output in StreamTask#processInput Key: FLINK-14553 URL: https://issues.apache.org/jira/browse/FLINK-14553 Project: Flink Issue Type: Sub-task Components: Runtime / Task Reporter: zhijiang Assignee: zhijiang Fix For: 1.10.0 Non-blocking output was introduced in FLINK-14396 and FLINK-14498 to solve the problem of handling the checkpoint barrier under backpressure. In order to make the whole process non-blocking, {{StreamInputProcessor}} should process input elements only if the output is also available. The default core size of the {{LocalBufferPool}} for {{ResultPartition}} should also be increased by 1 so as not to impact performance in the new mode; this tiny memory overhead can be ignored in practice. -- This message was sent by Atlassian Jira (v8.3.4#803005)
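The gating described above (pulling an input element only when the output side is also available, so a backpressured output never blocks inside element processing) can be sketched with a toy processing loop. The interfaces and bounded output buffer below are illustrative; Flink's real StreamInputProcessor and availability futures are more involved:

```java
// Toy model of input/output availability gating in a processing loop.
import java.util.ArrayDeque;
import java.util.Queue;

public class GatedProcessor {
    final Queue<Integer> input = new ArrayDeque<>();
    final Queue<Integer> output = new ArrayDeque<>();
    final int outputCapacity; // stand-in for the buffer pool size

    GatedProcessor(int outputCapacity) {
        this.outputCapacity = outputCapacity;
    }

    boolean outputAvailable() {
        return output.size() < outputCapacity;
    }

    // Process at most one element; returns false when no progress is
    // possible (no input, or the output side is backpressured).
    boolean processOne() {
        if (input.isEmpty() || !outputAvailable()) {
            return false;
        }
        output.add(input.poll() * 2); // stand-in for the actual operator
        return true;
    }
}
```

Because the check happens before an element is taken, the task can still react to events such as a checkpoint barrier while the output is full, which is the point of the non-blocking design.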
Re: [ANNOUNCE] Becket Qin joins the Flink PMC
Congratulations :) Piotrek > On 29 Oct 2019, at 08:59, Dawid Wysakowicz wrote: > > Congrats, > > Best > > Dawid > > On 28/10/2019 17:37, Yun Tang wrote: >> Congratulations, Becket >> >> Best >> Yun Tang >> >> From: Rong Rong >> Sent: Tuesday, October 29, 2019 0:19 >> To: dev >> Cc: Becket Qin >> Subject: Re: [ANNOUNCE] Becket Qin joins the Flink PMC >> >> Congratulations Becket!! >> >> -- >> Rong >> >> On Mon, Oct 28, 2019, 7:53 AM Jark Wu wrote: >> >>> Congratulations Becket! >>> >>> Best, >>> Jark >>> >>> On Mon, 28 Oct 2019 at 20:26, Benchao Li wrote: >>> Congratulations Becket. Dian Fu wrote on Mon, Oct 28, 2019 at 7:22 PM: > Congrats, Becket. > >> On Oct 28, 2019, at 6:07 PM, Fabian Hueske wrote: >> >> Hi everyone, >> >> I'm happy to announce that Becket Qin has joined the Flink PMC. >> Let's congratulate and welcome Becket as a new member of the Flink >>> PMC! >> Cheers, >> Fabian > -- Benchao Li School of Electronics Engineering and Computer Science, Peking University Tel:+86-15650713730 Email: libenc...@gmail.com; libenc...@pku.edu.cn
Re: [ANNOUNCE] Becket Qin joins the Flink PMC
Congrats, Best Dawid On 28/10/2019 17:37, Yun Tang wrote: > Congratulations, Becket > > Best > Yun Tang > > From: Rong Rong > Sent: Tuesday, October 29, 2019 0:19 > To: dev > Cc: Becket Qin > Subject: Re: [ANNOUNCE] Becket Qin joins the Flink PMC > > Congratulations Becket!! > > -- > Rong > > On Mon, Oct 28, 2019, 7:53 AM Jark Wu wrote: > >> Congratulations Becket! >> >> Best, >> Jark >> >> On Mon, 28 Oct 2019 at 20:26, Benchao Li wrote: >> >>> Congratulations Becket. >>> >>> Dian Fu wrote on Mon, Oct 28, 2019 at 7:22 PM: >>> Congrats, Becket. > On Oct 28, 2019, at 6:07 PM, Fabian Hueske wrote: > > Hi everyone, > > I'm happy to announce that Becket Qin has joined the Flink PMC. > Let's congratulate and welcome Becket as a new member of the Flink >> PMC! > Cheers, > Fabian >>> -- >>> >>> Benchao Li >>> School of Electronics Engineering and Computer Science, Peking University >>> Tel:+86-15650713730 >>> Email: libenc...@gmail.com; libenc...@pku.edu.cn >>>
[jira] [Created] (FLINK-14552) Enable partition statistics in blink planner
Jingsong Lee created FLINK-14552: Summary: Enable partition statistics in blink planner Key: FLINK-14552 URL: https://issues.apache.org/jira/browse/FLINK-14552 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Reporter: Jingsong Lee We need to update statistics after partition pruning in PushPartitionIntoTableSourceScanRule. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14551) Unaligned checkpoints
zhijiang created FLINK-14551: Summary: Unaligned checkpoints Key: FLINK-14551 URL: https://issues.apache.org/jira/browse/FLINK-14551 Project: Flink Issue Type: New Feature Components: Runtime / Checkpointing, Runtime / Network Reporter: zhijiang This is the umbrella issue for the feature of unaligned checkpoints. Refer to the [FLIP-76|https://cwiki.apache.org/confluence/display/FLINK/FLIP-76%3A+Unaligned+Checkpoints] for more details. -- This message was sent by Atlassian Jira (v8.3.4#803005)
How to use two continuously window with EventTime in sql
With a single SQL window, I can register a table with an event-time attribute and use that time field in the tumble window. But if I want to take the result of the first window and process it with a second window, how can I do that? Thank you.
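A sketch of the cascading setup being asked about (table and column names are made up for illustration; in Flink SQL, TUMBLE_ROWTIME is the group-window auxiliary function that emits a new event-time attribute from the first window so a second window can consume it):

```sql
-- First window: per-minute counts. TUMBLE_ROWTIME(...) returns a new
-- event-time attribute (window_ts) usable by downstream windows.
-- (How the intermediate result is registered -- a view here -- depends on
-- the client; it could also be registered via the Table API.)
CREATE VIEW minute_counts AS
SELECT
  TUMBLE_ROWTIME(ts, INTERVAL '1' MINUTE) AS window_ts,
  COUNT(*) AS cnt
FROM clicks
GROUP BY TUMBLE(ts, INTERVAL '1' MINUTE);

-- Second window: aggregates the first window's results per hour,
-- using window_ts as the event-time attribute.
SELECT
  TUMBLE_START(window_ts, INTERVAL '1' HOUR) AS hour_start,
  SUM(cnt) AS total_cnt
FROM minute_counts
GROUP BY TUMBLE(window_ts, INTERVAL '1' HOUR);
```

The key point is that TUMBLE_START/TUMBLE_END produce plain timestamps, while TUMBLE_ROWTIME preserves the rowtime property needed for the second event-time window.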
Re: [DISCUSS] Move flink-orc to flink-formats from flink-connectors
+1 to move flink-orc to flink-formats. Since we have put the other file-based formats (flink-avro, flink-json, flink-parquet, flink-csv, flink-sequence-file) in flink-formats, flink-orc should be the same. *Best Regards,* *Zhenghua Gao* On Tue, Oct 29, 2019 at 11:10 AM Jingsong Lee wrote: > Hi devs, > > I found that Orc is still in connectors, but we already have the parent > module of formats. > - Is it better to put flink-orc in flink-formats? > - Is there a compatibility issue with the move? Looks like only the parent > module needs to be changed, and no other changes are needed. > > -- > Best, Jingsong Lee >