Re: [DISCUSS] Reducing build times

2019-08-16 Thread Aljoscha Krettek
Speaking of flink-shaded, do we have any idea what the impact of shading is on the build time? We could get rid of shading completely in the Flink main repository by moving everything that we shade to flink-shaded. Aljoscha > On 16. Aug 2019, at 14:58, Bowen Li wrote: > > +1 to Till's points

Re: [DISCUSS] Reducing build times

2019-08-16 Thread Chesnay Schepler
@Aljoscha Shading takes a few minutes for a full build; you can see this quite easily by looking at the compile step in the misc profile; all modules that take longer than a fraction of a second are usually caused by shading lots of classes.

Re: [DISCUSS] FLIP-53: Fine Grained Resource Management

2019-08-16 Thread Till Rohrmann
Hi Xintong, thanks for drafting this FLIP. I think your proposal helps to execute batch jobs more efficiently. Moreover, it enables the proper integration of the Blink planner, which is very important as well. Overall, the FLIP looks good to me. I was wondering whether it

Re: [VOTE] Flink Project Bylaws

2019-08-16 Thread Chesnay Schepler
+1 (binding) Although I think it would be a good idea to always cc priv...@flink.apache.org when modifying bylaws, if anything to speed up the voting process. On 16/08/2019 11:26, Ufuk Celebi wrote: +1 (binding) – Ufuk On Wed, Aug 14, 2019 at 4:50 AM Biao Liu wrote: +1 (non-binding)

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-08-16 Thread Till Rohrmann
Thanks for the clarification Xintong. I understand the two alternatives now. I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: The user might see that her process gets

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-08-16 Thread Xintong Song
Thanks for sharing your opinion Till. I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3. But after giving it a second thought, I think even for alternative 3 using direct memory for

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-08-16 Thread Jark Wu
Thanks for starting this design, Timo and Dawid. Improving ConfigOption has been on my mind for a long time. We have seen the benefit when developing Blink configurations and connector properties in the 1.9 release. Thanks for bringing it up and making such a detailed design. I will leave my

[VOTE] FLIP-50: Spill-able Heap State Backend

2019-08-16 Thread Yu Li
Hi All, Since we have reached a consensus in the discussion thread [1], I'd like to start the voting for FLIP-50 [2]. This vote will be open for at least 72 hours. Unless there are objections, I will try to close it by end of Tuesday, August 20, 2019 if we have sufficient votes. Thanks. [1]

Re: [VOTE] Flink Project Bylaws

2019-08-16 Thread Becket Qin
Hi Chesnay, Thanks for responding. I think cc private@ is a good idea. I just added that to the CC list. We are following the 2/3 majority voting scheme defined in the bylaws here. I should have referred to the terms in the bylaws instead of rephrasing them. Thanks, Jiangjie (Becket) Qin On

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-08-16 Thread Zili Chen
Hi Timo, It looks interesting. Thanks for preparing this FLIP! The client API enhancement benefits from this evolution, which hopefully provides a better view of the configuration of Flink. In the client API enhancement, we likely make the deployment of cluster and submission of job totally defined by

Re: Inverted classloading for client

2019-08-16 Thread Paul Lam
Hi, I’ve created a ticket to track this problem [1]. Any comments will be appreciated. [1] https://issues.apache.org/jira/browse/FLINK-13749 Best, Paul Lam > On Aug 9, 2019, at 11:16, Paul Lam wrote: > > Hi devs, > > Flink uses inverted class

Re: [DISCUSS] Reducing build times

2019-08-16 Thread Bowen Li
+1 to Till's points on #2 and #5, especially the potential non-disruptive, gradual migration approach if we decide to go that route. To add on, I want to point out that we can actually start with the flink-shaded project [1], which is a perfect candidate for a PoC. It's of much smaller size, totally

Re: [VOTE] FLIP-51: Rework of the Expression Design

2019-08-16 Thread Aljoscha Krettek
+1 This seems to be a good refactoring/cleanup step to me! > On 16. Aug 2019, at 10:59, Dawid Wysakowicz wrote: > > +1 from my side > > Best, > > Dawid > > On 16/08/2019 10:31, Jark Wu wrote: >> +1 from my side. >> >> Thanks Jingsong for driving this. >> >> Best, >> Jark >> >> On Thu, 15

Re: [DISCUSS] Flink client api enhancement for downstream project

2019-08-16 Thread Aljoscha Krettek
Hi, I read both Jeff's initial design document and the newer document by Tison. I also finally found the time to collect our thoughts on the issue. I had quite some discussions with Kostas and this is the result: [1]. I think overall we agree that this part of the code is in dire need of some

Re: [VOTE] Flink Project Bylaws

2019-08-16 Thread Chesnay Schepler
The wording of the original mail is ambiguous imo. "The vote requires 2/3 majority of the binding +1s to pass." This to me reads very much "This vote passes if 2/3 of all votes after the voting period are +1." Maybe it's just a wording thing, but it was not clear to me that this follows the

Re: [DISCUSS] Reducing build times

2019-08-16 Thread Chesnay Schepler
Update: TL;DR: table-planner is a good candidate for enabling fork reuse right away, while flink-tests has the potential for huge savings, but we have to figure out some issues first. Build link: https://travis-ci.org/zentol/flink/builds/572659220 4/8 profiles failed. No speedup in

[jira] [Created] (FLINK-13751) Add Built-in vector types

2019-08-16 Thread Xu Yang (JIRA)
Xu Yang created FLINK-13751: --- Summary: Add Built-in vector types Key: FLINK-13751 URL: https://issues.apache.org/jira/browse/FLINK-13751 Project: Flink Issue Type: Sub-task Components:

Re: [DISCUSS] Update our Roadmap

2019-08-16 Thread Robert Metzger
Flink 1.9 is feature frozen and almost released. I guess it makes sense to update the roadmap on the website again. Who feels they have a good overview of what's coming up? On Tue, May 7, 2019 at 4:33 PM Fabian Hueske wrote: > Yes, that's a very good proposal Jark. > +1 > > Best, Fabian > >

[DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-08-16 Thread Timo Walther
Hi everyone, Dawid and I are working on making parts of ExecutionConfig and TableConfig configurable via config options. This is necessary to make all properties also available in SQL. Additionally, with the new SQL DDL based on properties as well as more connectors and formats coming up,

Re: [DISCUSS] Reducing build times

2019-08-16 Thread Till Rohrmann
For the sake of keeping the discussion focused and not cluttering the discussion thread I would suggest to split the detailed reporting for reusing JVMs to a separate thread and cross linking it from here. Cheers, Till On Fri, Aug 16, 2019 at 1:36 PM Chesnay Schepler wrote: > Update: > >

Re: [DISCUSS] FLIP-53: Fine Grained Resource Management

2019-08-16 Thread Xintong Song
Thanks for the feedback, Yangze and Till. Yangze, I agree with you that we should make the scheduling strategy pluggable and optimize the strategy to reduce the memory fragmentation problem, and thanks for the inputs on the potential algorithmic solutions. However, I'm in favor of keeping this FLIP

Re: [VOTE] Flink Project Bylaws

2019-08-16 Thread Chesnay Schepler
I'm very late to the party, but isn't it a bit weird that we're using a voting scheme that isn't laid out in the bylaws? Additionally, I would heavily suggest to CC priv...@flink.apache.org, as we want as many PMC members as possible to look at this. (I would regard this point as a reason for

Re: [VOTE] Flink Project Bylaws

2019-08-16 Thread Dawid Wysakowicz
AFAIK this voting scheme is described in the "Modifying Bylaws" section, in the end introducing bylaws is a modify operation ;) . I think it is a valid point to CC priv...@flink.apache.org in the future. I wouldn't say it is a must though. The voting scheme requires that every PMC member has to be

[jira] [Created] (FLINK-13752) TaskDeploymentDescriptor cannot be recycled by GC due to referenced by an anonymous function

2019-08-16 Thread Yun Gao (JIRA)
Yun Gao created FLINK-13752: --- Summary: TaskDeploymentDescriptor cannot be recycled by GC due to referenced by an anonymous function Key: FLINK-13752 URL: https://issues.apache.org/jira/browse/FLINK-13752

[jira] [Created] (FLINK-13754) Decouple OperatorChain from StreamStatusMaintainer

2019-08-16 Thread zhijiang (JIRA)
zhijiang created FLINK-13754: Summary: Decouple OperatorChain from StreamStatusMaintainer Key: FLINK-13754 URL: https://issues.apache.org/jira/browse/FLINK-13754 Project: Flink Issue Type:

[jira] [Created] (FLINK-13753) Integrate new Source Operator with Mailbox Model in StreamTask

2019-08-16 Thread zhijiang (JIRA)
zhijiang created FLINK-13753: Summary: Integrate new Source Operator with Mailbox Model in StreamTask Key: FLINK-13753 URL: https://issues.apache.org/jira/browse/FLINK-13753 Project: Flink

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-08-16 Thread JingsongLee
+1 to this, thanks Timo and Dawid for the design. This allows the currently cluttered configuration of various modules to be unified. This is also the first step toward one of the keys to making the new unified TableEnvironment available for production. Previously, we did encounter complex configurations,

[jira] [Created] (FLINK-13744) Improper error message when submit flink job in yarn-cluster mode without hadoop lib bundled

2019-08-16 Thread Jeff Zhang (JIRA)
Jeff Zhang created FLINK-13744: -- Summary: Improper error message when submit flink job in yarn-cluster mode without hadoop lib bundled Key: FLINK-13744 URL: https://issues.apache.org/jira/browse/FLINK-13744

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

2019-08-16 Thread Till Rohrmann
+1 for this FLIP and the feature. I think this feature will be super helpful for many Flink users. Once the SpillableHeapKeyedStateBackend has proven to be superior to the HeapKeyedStateBackend we should think about removing the latter completely to reduce maintenance burden. Cheers, Till On

[jira] [Created] (FLINK-13747) Remove some TODOs in Hive connector

2019-08-16 Thread Rui Li (JIRA)
Rui Li created FLINK-13747: -- Summary: Remove some TODOs in Hive connector Key: FLINK-13747 URL: https://issues.apache.org/jira/browse/FLINK-13747 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-13746) Elasticsearch (v2.3.5) sink end-to-end test fails on Travis

2019-08-16 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-13746: - Summary: Elasticsearch (v2.3.5) sink end-to-end test fails on Travis Key: FLINK-13746 URL: https://issues.apache.org/jira/browse/FLINK-13746 Project: Flink

Re: flink 1.9 DDL nested json derived

2019-08-16 Thread Danny Chan
Hi, Shengnan YU ~ You can reference the test cases in FlinkDDLDataTypeTest[1] for a quick reference of what a DDL column type looks like. [1] 

[jira] [Created] (FLINK-13743) Port PythonTableUtils to flink-python module

2019-08-16 Thread Dian Fu (JIRA)
Dian Fu created FLINK-13743: --- Summary: Port PythonTableUtils to flink-python module Key: FLINK-13743 URL: https://issues.apache.org/jira/browse/FLINK-13743 Project: Flink Issue Type: Task

[jira] [Created] (FLINK-13748) Streaming File Sink s3 end-to-end test failed on Travis

2019-08-16 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-13748: - Summary: Streaming File Sink s3 end-to-end test failed on Travis Key: FLINK-13748 URL: https://issues.apache.org/jira/browse/FLINK-13748 Project: Flink

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-16 Thread Yu Li
+1 (non-binding) - checked release notes: OK - checked sums and signatures: OK - source release - contains no binaries: OK - contains no 1.9-SNAPSHOT references: OK - build from source: OK (8u102) - mvn clean verify: OK (8u102) - binary release - no examples appear to be

[jira] [Created] (FLINK-13745) Flink cache on Travis does not exist

2019-08-16 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-13745: - Summary: Flink cache on Travis does not exist Key: FLINK-13745 URL: https://issues.apache.org/jira/browse/FLINK-13745 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-13756) Modify Code Annotations for findAndCreateTableSource in TableFactoryUtil

2019-08-16 Thread hehuiyuan (JIRA)
hehuiyuan created FLINK-13756: - Summary: Modify Code Annotations for findAndCreateTableSource in TableFactoryUtil Key: FLINK-13756 URL: https://issues.apache.org/jira/browse/FLINK-13756 Project: Flink

Re: [VOTE] Flink Project Bylaws

2019-08-16 Thread Shaoxuan Wang
+1 (binding) On Fri, Aug 16, 2019 at 7:48 PM Chesnay Schepler wrote: > +1 (binding) > > Although I think it would be a good idea to always cc > priv...@flink.apache.org when modifying bylaws, if anything to speed up > the voting process. > > On 16/08/2019 11:26, Ufuk Celebi wrote: > > +1

[jira] [Created] (FLINK-13757) Document error for `logical functions`

2019-08-16 Thread hehuiyuan (JIRA)
hehuiyuan created FLINK-13757: - Summary: Document error for `logical functions` Key: FLINK-13757 URL: https://issues.apache.org/jira/browse/FLINK-13757 Project: Flink Issue Type: Wish

[jira] [Created] (FLINK-13755) support Hive built-in functions in Flink

2019-08-16 Thread Bowen Li (JIRA)
Bowen Li created FLINK-13755: Summary: support Hive built-in functions in Flink Key: FLINK-13755 URL: https://issues.apache.org/jira/browse/FLINK-13755 Project: Flink Issue Type: New Feature

[jira] [Created] (FLINK-13749) Make Flink client respect classloading policy

2019-08-16 Thread Paul Lin (JIRA)
Paul Lin created FLINK-13749: Summary: Make Flink client respect classloading policy Key: FLINK-13749 URL: https://issues.apache.org/jira/browse/FLINK-13749 Project: Flink Issue Type:

[jira] [Created] (FLINK-13750) Separate HA services between client-/ and server-side

2019-08-16 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-13750: Summary: Separate HA services between client-/ and server-side Key: FLINK-13750 URL: https://issues.apache.org/jira/browse/FLINK-13750 Project: Flink

Re: [DISCUSS] Reducing build times

2019-08-16 Thread Arvid Heise
Thank you for starting the discussion as well! +1 to 1. it seems to be a quite low-hanging fruit that we should try to employ as much as possible. -0 to 2. the build setup is already very complicated. Adding new functionality that I would expect to come out of the box of a modern build tool

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-16 Thread Guowei Ma
Hi, -1 We have a benchmark job, which includes a two-input operator. This job has a big performance regression using 1.9 compared to 1.8. It's still not very clear why this regression happens. Best, Guowei On Fri, Aug 16, 2019 at 3:27 PM, Yu Li wrote: > +1 (non-binding) > > - checked release notes: OK > -

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-08-16 Thread Till Rohrmann
I guess you have to help me understand the difference between alternative 2 and 3 with respect to memory underutilization, Xintong. - Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low resulting in a lot of garbage

Re: [VOTE] FLIP-51: Rework of the Expression Design

2019-08-16 Thread Jark Wu
+1 from my side. Thanks Jingsong for driving this. Best, Jark On Thu, 15 Aug 2019 at 22:09, Timo Walther wrote: > +1 for this. > > Thanks, > Timo > > On 15.08.19 at 15:57, JingsongLee wrote: > > Hi Flink devs, > > > > I would like to start the voting for FLIP-51 Rework of the Expression > >

Re: [DISCUSS] Reducing build times

2019-08-16 Thread Chesnay Schepler
There appears to be a general agreement that 1) should be looked into; I've set up a branch with fork reuse enabled for all tests and will report back the results. On 15/08/2019 09:38, Chesnay Schepler wrote: Hello everyone, improving our build times is a hot topic at the moment so let's

Re: [VOTE] FLIP-51: Rework of the Expression Design

2019-08-16 Thread Dawid Wysakowicz
+1 from my side Best, Dawid On 16/08/2019 10:31, Jark Wu wrote: > +1 from my side. > > Thanks Jingsong for driving this. > > Best, > Jark > > On Thu, 15 Aug 2019 at 22:09, Timo Walther wrote: > >> +1 for this. >> >> Thanks, >> Timo >> >> On 15.08.19 at 15:57, JingsongLee wrote: >>> Hi Flink

Re: [DISCUSS] FLIP-53: Fine Grained Resource Management

2019-08-16 Thread Yangze Guo
Hi, Xintong Thanks for proposing this FLIP. The general design looks good to me, +1 for this feature. Since slots in the same task executor could have different resource profiles, we will run into a resource fragmentation problem. Think about this case: - request A wants 1G memory while requests B & C want 0.5G

[DISCUSS] Release flink-shaded 8.0

2019-08-16 Thread Chesnay Schepler
Hello, I would like to kick off the next flink-shaded release next week. There are 2 ongoing efforts that are blocked on this release: * [FLINK-13467] Java 11 support requires a bump to ASM to correctly handle Java 11 bytecode * [FLINK-11767] Reworking the

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-16 Thread Till Rohrmann
Thanks for reporting this issue Guowei. Could you share a few more details about what the job does exactly and which operators it uses? Does the job use the new `TwoInputSelectableStreamTask`, which might cause the performance regression? I think it is important to understand where the problem comes

Re: [DISCUSS] Reducing build times

2019-08-16 Thread Xiyuan Wang
6. CI service I'm not very familiar with Travis, but according to its official docs [1][2], is it possible to run jobs in parallel? AFAIK, many CI systems support this kind of feature. [1]: https://docs.travis-ci.com/user/speeding-up-the-build/#parallelizing-your-builds-across-virtual-machines

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-16 Thread Guowei Ma
Hi Till, I can send the job to you offline. It is just a datastream job and does not use TwoInputSelectableStreamTask. A->B \ C / D->E Best, Guowei On Fri, Aug 16, 2019 at 4:34 PM, Till Rohrmann wrote: > Thanks for reporting this issue Guowei. Could you share a bit more details >

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-08-16 Thread Xintong Song
Let me explain this with a concrete example Till. Let's say we have the following scenario. Total Process Memory: 1GB JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB For alternative
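The figures in this example can be checked with a small sketch. It assumes, as the earlier message in this thread describes for alternative 2, that -XX:MaxDirectMemorySize is set to Task Off-Heap Memory plus JVM Overhead, and it treats 1GB as 1000MB so that the two parts add up (an assumption about the rounding used). The function name is hypothetical and not part of Flink.

```python
# Figures from the example above; 1 GB is treated as 1000 MB here,
# which is an assumption so that the two budgets sum to the total.
TOTAL_PROCESS_MB = 1000
JVM_DIRECT_MB = 200  # Task Off-Heap Memory + JVM Overhead
OTHER_MB = 800       # JVM Heap, Metaspace, Off-Heap Managed Memory, Network Memory


def max_direct_memory_flag(direct_mb: int) -> str:
    """Build the JVM option that caps direct memory at direct_mb megabytes.

    Hypothetical helper for illustration only; it is not a Flink API.
    """
    return f"-XX:MaxDirectMemorySize={direct_mb}m"


# The two budgets should account for the whole process memory.
assert JVM_DIRECT_MB + OTHER_MB == TOTAL_PROCESS_MB

print(max_direct_memory_flag(JVM_DIRECT_MB))
```

With an explicit limit like this, direct allocations beyond 200MB would fail instead of silently growing the process, which is the trade-off the alternatives in this thread are weighing.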

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-16 Thread Terry Wang
Congratulations Andrey! Best, Terry Wang > On Aug 15, 2019, at 9:27 PM, Hequn Cheng wrote: > > Congratulations Andrey! > > On Thu, Aug 15, 2019 at 3:30 PM Fabian Hueske > wrote: > Congrats Andrey! > > On Thu, Aug 15, 2019 at 07:58, Gary Yao wrote

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-16 Thread Gyula Fóra
Hi all, I agree with Till that we should investigate the suspected performance regression issue before proceeding with the release. If we do not find any problem, I vote +1. I have verified the following behaviour: - Built Flink with custom Hadoop version - YARN Deployment with and without

Re: [VOTE] Flink Project Bylaws

2019-08-16 Thread Ufuk Celebi
+1 (binding) – Ufuk On Wed, Aug 14, 2019 at 4:50 AM Biao Liu wrote: > +1 (non-binding) > > Thanks for pushing this! > > Thanks, > Biao /'bɪ.aʊ/ > > > > On Wed, 14 Aug 2019 at 09:37, Jark Wu wrote: > > > +1 (non-binding) > > > > Best, > > Jark > > > > On Wed, 14 Aug 2019 at 09:22, Kurt Young