Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-09-02 Thread Becket Qin
Hi Timo, I think I might have misunderstood the scope or motivation of the FLIP a little bit. Please let me clarify a little bit. Regarding "great if we don't put this burden on users", we should > consider who is actually using this API. It is not first-level API but > mostly API for Flink

[jira] [Created] (FLINK-13943) Provide api to convert flink table to java List

2019-09-02 Thread Jeff Zhang (Jira)
Jeff Zhang created FLINK-13943: -- Summary: Provide api to convert flink table to java List Key: FLINK-13943 URL: https://issues.apache.org/jira/browse/FLINK-13943 Project: Flink Issue Type:

Re: [SURVEY] Is the default restart delay of 0s causing problems?

2019-09-02 Thread Zhu Zhu
1s looks good to me. And I think the conclusion about when a user should override the delay is worth documenting. Thanks, Zhu Zhu On Tue, Sep 3, 2019 at 4:42 AM Steven Wu wrote: > 1s sounds a good tradeoff to me. > > On Mon, Sep 2, 2019 at 1:30 PM Till Rohrmann wrote: > >> Thanks a lot for all your

Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

2019-09-02 Thread Jark Wu
big +1 to the idea of restructuring the docs. We got a lot of complaints from users about the Table & SQL docs. In general, I think the new structure is very nice. Regarding moving "User-defined Extensions" to corresponding broader topics, I would prefer the current "User-defined Extensions".

Re: [DISCUSS] Add ARM CI build to Flink (information-only)

2019-09-02 Thread Xiyuan Wang
The ARM CI trigger has been changed to `github comment` way only. It means that every PR won't start ARM test unless a comment `check_arm` is added. Like what I did in the PR[1]. A POC for Flink nightly end to end test job is created as well[2]. I'll improve it then. Any feedback or question?

[DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-02 Thread Yijie Shen
Dear Flink Community! I would like to open the discussion of contributing Pulsar Flink connector [0] back to Flink. ## A brief introduction to Apache Pulsar Apache Pulsar[1] is a multi-tenant, high-performance distributed pub-sub messaging system. Pulsar includes multiple features such as

Build failure on flink-python

2019-09-02 Thread Biao Liu
Hi guys, I just found I can't pass the Travis build due to some errors in the flink-python module [1]. I'm sure my PR has nothing to do with flink-python. There are also a lot of builds failing on these errors. I have rebased onto the master branch and tried several times, but it doesn't work.

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-09-02 Thread Dawid Wysakowicz
Hi Timo, Becket From the options that Timo suggested for improving the mutability situation I would prefer option a) as this is the more explicit option and simpler option. Just as a remark, I think in general Configurable types for options will be rather very rare for some special use-cases, as

Re: Build failure on flink-python

2019-09-02 Thread Hequn Cheng
Hi Biao, Thanks a lot for reporting the problem. The fix has been merged into the master just now. You can rebase to the master and try again. Thanks to @Wei Zhong for the fix. Best, Hequn On Mon, Sep 2, 2019 at 4:41 PM Biao Liu wrote: > There are already some Jira tickets opened for this

Re: [DISCUSS] FLIP-55: Introduction of a Table API Java Expression DSL

2019-09-02 Thread Timo Walther
Hi all, I see a majority votes for `lit(12)` so let's adopt that in the FLIP. The `$("field")` would consider Fabian's concerns so I would vote for keeping it like that. One more question for native English speakers, is it acceptable to have `isEqual` instead of `isEqualTo` and `isGreater`

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

2019-09-02 Thread Dawid Wysakowicz
Hi Gyula, Yes you are right, we were also considering the external configurer. The reason we suggest the built in method is that it is more tightly coupled with the place the options are actually set. Therefore our hope is that, whenever somebody e.g. adds new fields to the ExecutionConfig he/she

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-09-02 Thread Yang Wang
I also agree that all the configuration should be calculated outside of the TaskManager. So a full configuration should be generated before the TaskManager is started. Overriding the calculated configurations through -D now seems better. Best, Yang On Mon, Sep 2, 2019 at 11:39 AM Xintong Song wrote: > I just updated the

Re: Build failure on flink-python

2019-09-02 Thread Biao Liu
Hi Hequn, Glad to hear that! Thanks a lot. Thanks, Biao /'bɪ.aʊ/ On Mon, 2 Sep 2019 at 17:28, Hequn Cheng wrote: > Hi Biao, > > Thanks a lot for reporting the problem. The fix has been merged into the > master just now. You can rebase to the master and try again. > > Thanks to @Wei Zhong

Re: Build failure on flink-python

2019-09-02 Thread Biao Liu
There are already some Jira tickets opened for this failure [1] [2]. Sorry I didn't recognize them. 1. https://issues.apache.org/jira/browse/FLINK-13906 2. https://issues.apache.org/jira/browse/FLINK-13932 Thanks, Biao /'bɪ.aʊ/ On Mon, 2 Sep 2019 at 16:24, Biao Liu wrote: > Hi guys, > > I

Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-09-02 Thread Timo Walther
Hi all, the FLIP looks awesome. However, I would like to discuss the changes to the user-facing parts again. Some feedback: 1. DataViews: With the current non-annotation design for DataViews, we cannot perform eager state declaration, right? At which point during execution do we know which

Re: [DISCUSS] Releasing Flink 1.8.2

2019-09-02 Thread Aljoscha Krettek
I cut a PR for FLINK-13586: https://github.com/apache/flink/pull/9595 > On 2. Sep 2019, at 05:03, Yu Li wrote: > > +1 for a 1.8.2 release, thanks for bringing this up Jincheng! > > Best Regards, > Yu > > > On Mon, 2 Sep 2019 at 09:19, Thomas Weise

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-02 Thread Jan Lukavský
Essentially, the class loader of Flink should be present in parent hierarchy of context class loader. If FlinkUserCodeClassLoader doesn't use context class loader, then it is actually impossible to use a hierarchy like this:  system class loader -> application class loader -> user-defined

Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-09-02 Thread Andrey Zagrebin
Hi All, @Xintong thanks a lot for driving the discussion. I also reviewed the FLIP and it looks quite good to me. Here are some comments: - One thing I wanted to discuss is the backwards-compatibility with the previous user setups. We could list which options we plan to deprecate.

Re: How to handle Flink Job with 400MB+ Uberjar with 800+ containers ?

2019-09-02 Thread Yang Wang
Hi Dadashov, Regarding your questions. > Q1 Do all those 800 nodes download of batch of 3 at a time The 800+ containers will be allocated on different yarn nodes. By default, the LocalResourceVisibility is APPLICATION, so they will be downloaded only once and shared for all taskmanager

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-09-02 Thread Timo Walther
Hi Becket, Re 1 & 3: "values in configurations should actually be immutable" I would also prefer immutability but most of our configuration is mutable due to serialization/deserialization. Also maps and list could be mutable in theory. It is difficult to really enforce that for nested

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-02 Thread Aljoscha Krettek
Hi, I actually don’t know whether that change would be ok. FlinkUserCodeClassLoader has taken FlinkUserCodeClassLoader.class.getClassLoader() as the parent ClassLoader before my change. See:

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-02 Thread Aljoscha Krettek
I’m not saying we can’t change that code to use the context class loader. I’m just not sure whether this might break other things. Best, Aljoscha > On 2. Sep 2019, at 11:24, Jan Lukavský wrote: > > Essentially, the class loader of Flink should be present in parent hierarchy > of context

[jira] [Created] (FLINK-13937) Fix the error of the hive connector dependency version

2019-09-02 Thread Jeff Yang (Jira)
Jeff Yang created FLINK-13937: - Summary: Fix the error of the hive connector dependency version Key: FLINK-13937 URL: https://issues.apache.org/jira/browse/FLINK-13937 Project: Flink Issue

[jira] [Created] (FLINK-13938) Use yarn public distributed cache to speed up containers launch

2019-09-02 Thread Yang Wang (Jira)
Yang Wang created FLINK-13938: - Summary: Use yarn public distributed cache to speed up containers launch Key: FLINK-13938 URL: https://issues.apache.org/jira/browse/FLINK-13938 Project: Flink

Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

2019-09-02 Thread vino yang
Agree with Dawid's suggestion about functions. Having a Functions section to unify the built-in functions and UDFs would be better. On Fri, Aug 30, 2019 at 7:43 PM Dawid Wysakowicz wrote: > +1 to the idea of restructuring the docs. > > My only suggestion to consider is how about moving the >

Re: [VOTE] FLIP-54: Evolve ConfigOption and Configuration

2019-09-02 Thread vino yang
+1 On Fri, Aug 30, 2019 at 7:34 PM Dawid Wysakowicz wrote: > +1 to the design > > On 29/08/2019 15:53, Timo Walther wrote: > > I converted the mentioned Google doc into a wiki page: > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-54%3A+Evolve+ConfigOption+and+Configuration > > > > > > The

Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

2019-09-02 Thread Kurt Young
+1 to the general idea and thanks for driving this. I think the new structure is more clear than the old one, and i have some suggestions: 1. How about adding a "Architecture & Internals" chapter? This can help developers or users who want to contribute more to have a better understanding about

[jira] [Created] (FLINK-13939) pyflink ExecutionConfigTests#test_equals_and_hash test failed

2019-09-02 Thread vinoyang (Jira)
vinoyang created FLINK-13939: Summary: pyflink ExecutionConfigTests#test_equals_and_hash test failed Key: FLINK-13939 URL: https://issues.apache.org/jira/browse/FLINK-13939 Project: Flink Issue

Kafka Checkpointing weird behavior.

2019-09-02 Thread Dominik Wosiński
Hey, I just want to understand something, because I am observing weird behavior of Kafka Consumer > 0.8 . So the idea is, if we enable the checkpointing and enable the commit offsets on checkpoint, which AFAIK is enabled by default, then for versions of Kafka > 0.8 we should see the changes in

[jira] [Created] (FLINK-13940) S3RecoverableWriter causes job to get stuck in recovery

2019-09-02 Thread Jimmy Weibel Rasmussen (Jira)
Jimmy Weibel Rasmussen created FLINK-13940: -- Summary: S3RecoverableWriter causes job to get stuck in recovery Key: FLINK-13940 URL: https://issues.apache.org/jira/browse/FLINK-13940 Project:

Re: [DISCUSS] Releasing Flink 1.8.2

2019-09-02 Thread Kostas Kloudas
Hi all, I think this should also be considered a blocker: https://issues.apache.org/jira/browse/FLINK-13940. It is not a regression but it can result in data loss. I think I can have a quick fix by tomorrow. Cheers, Kostas On Mon, Sep 2, 2019 at 12:01 PM jincheng sun wrote: > > Thanks for all

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-09-02 Thread Timo Walther
@Becket: Regarding "great if we don't put this burden on users", we should consider who is actually using this API. It is not first-level API but mostly API for Flink contributors. Most of the users will use API classes like ExecutionConfig or TableConfig or other builders for performing

Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-09-02 Thread jincheng sun
Hi Timo, Great thanks for your feedback. I would like to share my thoughts with you inline. :) Best, Jincheng On Mon, Sep 2, 2019 at 5:04 PM Timo Walther wrote: > Hi all, > > the FLIP looks awesome. However, I would like to discuss the changes to > the user-facing parts again. Some feedback: > > 1.

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-09-02 Thread Aljoscha Krettek
Hi, Regarding the factory and duplicate() and whatnot, wouldn’t it work to have a factory like this: /** * Allows to read and write an instance from and to {@link Configuration}. A configurable instance * operates in its own key space in {@link Configuration} and will be (de)prefixed by the

Re: [DISCUSS] Releasing Flink 1.8.2

2019-09-02 Thread jincheng sun
Thanks for all of your feedback! Hi Jark, Glad to see that you are doing what an RM should be doing. One tip here: before RC1 all the blockers should be fixed, but the others are nice to have. So you can decide when to prepare RC1 after the blockers are resolved. Feel free to tell me if you have

Re: Potential block size issue with S3 binary files

2019-09-02 Thread Arvid Heise
Hi Ken, that's indeed a very odd issue that you found. I had a hard time connecting block size with S3 in the beginning and had to dig into the code. I still cannot fully understand why you got two different block size values from the S3 FileSystem. Looking into Hadoop code, I found the following

[jira] [Created] (FLINK-13942) Add Overview page for Getting Started section

2019-09-02 Thread Fabian Hueske (Jira)
Fabian Hueske created FLINK-13942: - Summary: Add Overview page for Getting Started section Key: FLINK-13942 URL: https://issues.apache.org/jira/browse/FLINK-13942 Project: Flink Issue Type:

[jira] [Created] (FLINK-13941) Prevent data-loss by not cleaning up small part files from S3.

2019-09-02 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-13941: -- Summary: Prevent data-loss by not cleaning up small part files from S3. Key: FLINK-13941 URL: https://issues.apache.org/jira/browse/FLINK-13941 Project: Flink

Re: [DISCUSS] FLIP-53: Fine Grained Resource Management

2019-09-02 Thread Zhu Zhu
Thanks Xintong for proposing this improvement. Fine grained resources can be very helpful when user has good planning on resources. I have a few questions: 1. Currently in a batch job, vertices from different regions can run at the same time in slots from the same shared group, as long as they do

Re: [DISCUSS] FLIP-54: Evolve ConfigOption and Configuration

2019-09-02 Thread Becket Qin
Hi Timo and Dawid, Thanks for the patient reply. I agree that both option a) and option b) can solve the mutability problem. For option a), is it a little intrusive to add a duplicate() method for a Configurable? It would be great if we don't put this burden on users if possible. For option b),

Re: [VOTE] FLIP-49: Unified Memory Configuration for TaskExecutors

2019-09-02 Thread Xintong Song
Hi everyone, I'm here to re-start the voting process for FLIP-49 [1], with respect to consensus reached in this thread [2] regarding some new comments and concerns. This voting will be open for at least 72 hours. I'll try to close it Sep. 5, 14:00 UTC, unless there is an objection or not enough

Re: [DISCUSS] Flink Python User-Defined Function for Table API

2019-09-02 Thread jincheng sun
Hi Shaoxuan, Thanks for the reminder. I think "Flink Python User-Defined Function for Table" makes sense to me. Best, Jincheng On Mon, Sep 2, 2019 at 5:04 PM Timo Walther wrote: > Hi all, > > the FLIP looks awesome. However, I would like to discuss the changes to > the user-facing parts again. Some

Re: [DISCUSS] Simplify Flink's cluster level RestartStrategy configuration

2019-09-02 Thread Till Rohrmann
The link to the dev ML discussion about FLIP-61 is https://lists.apache.org/thread.html/e206390127bcbd9b24d9c41a838faa75157e468e01552ad241e3e24b@%3Cdev.flink.apache.org%3E Cheers, Till On Mon, Sep 2, 2019 at 10:37 PM Till Rohrmann wrote: > Thanks a lot for the positive feedback. I think you

Re: [SURVEY] Is the default restart delay of 0s causing problems?

2019-09-02 Thread Till Rohrmann
Thanks a lot for all your feedback. I see there is a slight tendency towards having a non zero default delay so far. However, Yu has brought up some valid points. Maybe I can shed some light on a). Before FLINK-9158 we set the default delay to 10s because Flink did not support queued scheduling

Re: [DISCUSS] Simplify Flink's cluster level RestartStrategy configuration

2019-09-02 Thread Till Rohrmann
Thanks a lot for the positive feedback. I think you are right Becket that this needs a FLIP since it changes Flink's behaviour. I'll create one and post it to the dev ML. @Zhu Zhu I agree that the restart delay is related to the RestartStrategy configuration. However, I would like to exclude it

Re: [SURVEY] Is the default restart delay of 0s causing problems?

2019-09-02 Thread Steven Wu
1s sounds a good tradeoff to me. On Mon, Sep 2, 2019 at 1:30 PM Till Rohrmann wrote: > Thanks a lot for all your feedback. I see there is a slight tendency > towards having a non zero default delay so far. > > However, Yu has brought up some valid points. Maybe I can shed some light > on a). >

[DISCUSS] FLIP-61 Simplify Flink's cluster level RestartStrategy configuration

2019-09-02 Thread Till Rohrmann
Hi everyone, I'd like to discuss FLIP-61 [1] which tries to simplify Flink's cluster level RestartStrategy configuration. Currently, Flink's behaviour with respect to configuring the `RestartStrategies` is quite complicated and convoluted. The reason for this is that we evolved the way it has