Re: [DISCUSS] Towards a leaner flink-dist

2019-01-23 Thread Stephan Ewen
There are some points where a leaner approach could help. There are many libraries and connectors that are currently being added to Flink, which makes the "include all" approach not completely feasible in the long run: - Connectors: For a proper experience with the Shell/CLI (for example for SQL)

Applying for Flink contributor permission

2019-01-23 Thread Matthieu Bonneviot
Hi Please provide me contribution permission. email: matthieu.bonnev...@datadome.co apache-username: mbonneviot Thank you

Re: Request for permission

2019-01-23 Thread Fabian Hueske
Hi Liya Fan, Welcome to the Flink community! I gave you contributor permissions for Jira. Best, Fabian On Wed, Jan 23, 2019 at 08:01, Run wrote: > Hi Guys, > > > I want to contribute to Apache Flink. > Would you please give me the permission as a contributor? > My JIRA ID is fan_li_ya.

Re: Applying for Flink contributor permission

2019-01-23 Thread Robert Metzger
Hey Matthieu, welcome to the Flink community! I've added you as a contributor to our JIRA! Happy coding :) On Wed, Jan 23, 2019 at 9:39 AM Matthieu Bonneviot < matthieu.bonnev...@datadome.co> wrote: > Hi > > Please provide me contribution permission. > email: matthieu.bonnev...@datadome.co >

Re: issue in the MetricReporterRegistry

2019-01-23 Thread Chesnay Schepler
Just to make sure, this issue does not actually affect the behavior, does it? Since we only use these as a filter for reporters to activate. On 21.01.2019 18:22, Matthieu Bonneviot wrote: Hi I don't have the Jira permission but if you grant me the permission I could contribute to fix the

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-23 Thread Ufuk Celebi
I like the idea of a leaner binary distribution. At the same time I agree with Jamie that the current binary is quite convenient and connection speeds should not be that big of a deal. Since the binary distribution is one of the first entry points for users, I'd like to keep it as user-friendly as

Re: Applying for Flink contributor permission

2019-01-23 Thread Robert Metzger
Hey Lavkesh, welcome to the Flink community! I've added you as a contributor to our JIRA! Happy coding :) On Wed, Jan 23, 2019 at 9:32 AM Lavkesh Lahngir wrote: > Please provide me contribution permission. > email: lavkes...@gmail.com > apache-username: lavkesh > > Thank you >

Re: [ANNOUNCE] Contributing Alibaba's Blink

2019-01-23 Thread Kurt Young
Thanks @Stephan for this exciting announcement! From my point of view, I would prefer to use a branch. It makes the message "Blink is part of Flink" more straightforward and clear. Except for the location of the Blink code, there are some other questions like what version we should use, and where

[jira] [Created] (FLINK-11412) Remove legacy MesosFlinkResourceManager

2019-01-23 Thread TisonKun (JIRA)
TisonKun created FLINK-11412: Summary: Remove legacy MesosFlinkResourceManager Key: FLINK-11412 URL: https://issues.apache.org/jira/browse/FLINK-11412 Project: Flink Issue Type: Sub-task

Applying for Flink contributor permission

2019-01-23 Thread Lavkesh Lahngir
Please provide me contribution permission. email: lavkes...@gmail.com apache-username: lavkesh Thank you

[jira] [Created] (FLINK-11413) MetricReporter: "metrics.reporters" configuration has to be provided for reporters to be taken into account

2019-01-23 Thread Matthieu Bonneviot (JIRA)
Matthieu Bonneviot created FLINK-11413: Summary: MetricReporter: "metrics.reporters" configuration has to be provided for reporters to be taken into account Key: FLINK-11413 URL:

Re: [DISCUSS] Start new Review Process

2019-01-23 Thread Robert Metzger
Hey, as I've mentioned already in the pull request, I have started implementing a little bot for GitHub that tracks the checklist [1]. The bot is monitoring incoming pull requests. It creates a comment with the checklist. Reviewers can write a message to the bot (such as "@flinkbot approve

Re: Issues regarding Table-API

2019-01-23 Thread Fabian Hueske
Hi Elias, Q1: Can you post the exception that you receive? Q2: This is already possible today by converting a Table into a DataSet and registering that DataSet again as a Table. Under the hood, the following is happening: As you said, Tables are views (or logical plans). Whenever a Table is

Re: [DISCUSS] Start new Review Process

2019-01-23 Thread Fabian Hueske
Oh, that's great news! In that case we can just close the PR and start with the bot right away. I think it would be good to extend the PR Review guide [1] with a section about the bot and how to use it. Fabian [1] https://flink.apache.org/reviewing-prs.html On Wed, Jan 23, 2019 at 10:03

Re: [DISCUSS] A strategy for merging the Blink enhancements

2019-01-23 Thread Till Rohrmann
+1 for Stephan's merge proposal. I think it makes sense to pause the development of the Table API for a short time in order to be able to quickly converge on a common API. From my experience with the Flip-6 refactoring it can be challenging to catch up with a branch which is actively developed.

Re: [Feature]Returning RuntimeException to REST client while job submission

2019-01-23 Thread Lavkesh Lahngir
Actually, I realized my mistake: JarRunHandler is being used in the jar/run API call, and the changes are done in RestClusterClient. The problem I was facing was that it always gives me "The main method caused an error" without any more details. I am thinking when we throw

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-23 Thread Till Rohrmann
Ufuk's proposal (having a lean default release and a user convenience tarball) sounds good to me. That way advanced users won't be bothered by an unnecessarily large release and new users can benefit from having many useful extensions bundled in one tarball. Cheers, Till On Wed, Jan 23, 2019 at

[jira] [Created] (FLINK-11418) Unable to build docs in Docker image

2019-01-23 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-11418: Summary: Unable to build docs in Docker image Key: FLINK-11418 URL: https://issues.apache.org/jira/browse/FLINK-11418 Project: Flink Issue Type: Bug

Re: [Feature]Returning RuntimeException to REST client while job submission

2019-01-23 Thread Chesnay Schepler
I suggest that you first tell me which version you are using so that I can a) reproduce the issue and b) check that this issue wasn't fixed in master or a recent bugfix release. On 23.01.2019 17:16, Lavkesh Lahngir wrote: Actually, I realized my mistake that JarRunHandler is being used in the

Re: Request multiple subpartitions of one partition

2019-01-23 Thread Chris Miller
Hi Zhijang, thank you for your reply. I was playing around a little over the last few days and ended up with a solution where I change the ResultPartitionView's subpartitionIndex as soon as it returns an EndOfPartition Event. This way I can, sequentially, receive multiple subpartitions at one single

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-23 Thread Thomas Weise
+1 for trimming the size by default and offering the fat distribution as alternative download On Wed, Jan 23, 2019 at 8:35 AM Till Rohrmann wrote: > Ufuk's proposal (having a lean default release and a user convenience > tarball) sounds good to me. That way advanced users won't be bothered by

[jira] [Created] (FLINK-11419) StreamingFileSink fails to recover after taskmanager failure

2019-01-23 Thread Edward Rojas (JIRA)
Edward Rojas created FLINK-11419: Summary: StreamingFileSink fails to recover after taskmanager failure Key: FLINK-11419 URL: https://issues.apache.org/jira/browse/FLINK-11419 Project: Flink

Re: [ANNOUNCE] Contributing Alibaba's Blink

2019-01-23 Thread Shaoxuan Wang
Thanks Stephan, The entire plan looks good to me. WRT the "Docs for Flink", a subsection should be good enough if we just introduce the outlines of what blink has changed. However, we have made detailed introductions to blink based on the framework of current release document of Flink (those

Re: issue in the MetricReporterRegistry

2019-01-23 Thread Chesnay Schepler
nvm, it does indeed affect behavior :/ On 23.01.2019 10:08, Chesnay Schepler wrote: Just to make sure, this issue does not actually affect the behavior, does it? Since we only use these as a filter for reporters to activate. On 21.01.2019 18:22, Matthieu Bonneviot wrote: Hi I don't have the

[jira] [Created] (FLINK-11415) Introduce JobMasterServiceFactor for JobManagerRunner

2019-01-23 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-11415: Summary: Introduce JobMasterServiceFactor for JobManagerRunner Key: FLINK-11415 URL: https://issues.apache.org/jira/browse/FLINK-11415 Project: Flink

Re: [ANNOUNCE] Contributing Alibaba's Blink

2019-01-23 Thread Timo Walther
Hi Kurt, I would not make Blink's documentation visible to users or search engines via a website. Otherwise this would communicate that Blink is an official release. I would suggest putting the Blink docs into `/docs` and people can build it with `./docs/build.sh -pi` if there are

Re: [ANNOUNCE] Contributing Alibaba's Blink

2019-01-23 Thread Timo Walther
As far as I know, we will not provide any binaries but only the source code. JAR files on Apache servers would need an official voting/release process. Interested users can build Blink themselves using `mvn clean package`. @Stephan: Please correct me if I'm wrong. Regards, Timo On

[jira] [Created] (FLINK-11417) Make access to ExecutionGraph single threaded from JobMaster main thread

2019-01-23 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-11417: Summary: Make access to ExecutionGraph single threaded from JobMaster main thread Key: FLINK-11417 URL: https://issues.apache.org/jira/browse/FLINK-11417

[jira] [Created] (FLINK-11414) Introduce JobMasterService interface for the JobManagerRunner

2019-01-23 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-11414: Summary: Introduce JobMasterService interface for the JobManagerRunner Key: FLINK-11414 URL: https://issues.apache.org/jira/browse/FLINK-11414 Project: Flink

Re: Applying for Flink contributor permission

2019-01-23 Thread Matthieu Bonneviot
Thanks a lot On Wed, Jan 23, 2019 at 10:06, Robert Metzger wrote: > Hey Matthieu, > > welcome to the Flink community! > I've added you as a contributor to our JIRA! Happy coding :) > > > > On Wed, Jan 23, 2019 at 9:39 AM Matthieu Bonneviot < > matthieu.bonnev...@datadome.co> wrote: > > > Hi

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-23 Thread Timo Walther
+1 for Stephan's suggestion. For example, SQL connectors have never been part of the main distribution and nobody complained about this so far. I think what is more important than a big dist bundle is a helpful "Downloads" page where users can easily find available filesystems, connectors,

Re: issue in the MetricReporterRegistry

2019-01-23 Thread Matthieu Bonneviot
Yes indeed, it affects the behavior on Java 11. I have created a bug in Jira about it: Summary: MetricReporter: "metrics.reporters" configuration has to be provided for reporters to be taken into account Key: FLINK-11413 URL:

Re: [ANNOUNCE] Contributing Alibaba's Blink

2019-01-23 Thread Chesnay Schepler
From the ASF side, JAR files do not require a vote/release process; this is at the discretion of the PMC. However, I have my doubts whether at this time we could even create a source release of Blink given that we'd have to vet the code-base first. Even without source release we could still

Re: Request for permission

2019-01-23 Thread Run
Hi Fabian, Thank you so much for the kind help! Best, Liya Fan -- Original -- From: "Fabian Hueske"; Date: Jan 23, 2019 (Wed) 5:06 To: "dev"; Subject: Re: Request for permission Hi Liya Fan, Welcome to the Flink community! I gave you

Re: [DISCUSS] Start new Review Process

2019-01-23 Thread Robert Metzger
Okay, cool! I'll let you know when the bot is ready in a test repo. While you (and others) are testing it, I'll open a PR for the docs. On Wed, Jan 23, 2019 at 10:15 AM Fabian Hueske wrote: > Oh, that's great news! > In that case we can just close the PR and start with the bot right away. > I

[jira] [Created] (FLINK-11416) DISTINCT on a JOIN inside of an UNION is not working

2019-01-23 Thread Elias Saalmann (JIRA)
Elias Saalmann created FLINK-11416: Summary: DISTINCT on a JOIN inside of an UNION is not working Key: FLINK-11416 URL: https://issues.apache.org/jira/browse/FLINK-11416 Project: Flink

Re: [ANNOUNCE] Contributing Alibaba's Blink

2019-01-23 Thread Kurt Young
Hi Timo, What about the jar files, will Blink's jar be uploaded to the Apache repository? If not, I think it will be very inconvenient for users who want to try Blink and view the documents if they need some help from the docs. Best, Kurt On Wed, Jan 23, 2019 at 6:09 PM Timo Walther wrote: > Hi

Re: [DISCUSS] Start new Review Process

2019-01-23 Thread Robert Metzger
I have the bot now running in https://github.com/flinkbot/test-repo/pulls Feel free to play with it. On Wed, Jan 23, 2019 at 10:25 AM Robert Metzger wrote: > Okay, cool! I'll let you know when the bot is ready in a test repo. > While you (and others) are testing it, I'll open a PR for the docs.

Re: [Feature]Returning RuntimeException to REST client while job submission

2019-01-23 Thread Chesnay Schepler
Which version are you using? On 23.01.2019 08:00, Lavkesh Lahngir wrote: Or maybe I am missing something? It looks like the JIRA is trying to solve the same issues I stated. In the main method, I just threw a simple new Exception("Some message") and I got the response I mentioned from the rest

Re: [ANNOUNCE] Contributing Alibaba's Blink

2019-01-23 Thread Becket Qin
Really excited to see Blink joining the Flink community! My two cents regarding repo vs. branch: I am +1 for a branch in Flink. Among many things, what's most important at this point is probably to make Blink code available to the developers so people can discuss the merge strategy. Creating a

Re: [ANNOUNCE] Contributing Alibaba's Blink

2019-01-23 Thread Stephan Ewen
Nice to see this lively discussion. *--- Branch Versus Repository ---* Looks like this is converging towards pushing a branch. How about naming the branch simply "blink-1.5" ? That would be in line with the 1.5 version branch of Flink, which is simply called "release-1.5" ? *--- SGA --- * The

Re: [DISCUSS] A strategy for merging the Blink enhancements

2019-01-23 Thread Stephan Ewen
I think that is a reasonable proposal. Bugs that are identified could be fixed in the blink branch, so that we merge the working code. New feature contributions to that branch would complicate the merge. I would try and rather focus on merging and let new contributions go to the master branch.

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-23 Thread Ufuk Celebi
On Wed, Jan 23, 2019 at 11:01 AM Timo Walther wrote: > I think what is more important than a big dist bundle is a helpful > "Downloads" page where users can easily find available filesystems, > connectors, metric reporters. Not everyone checks Maven central for > available JAR files. I just saw

Re: [ANNOUNCE] Contributing Alibaba's Blink

2019-01-23 Thread Becket Qin
Thanks Stephan, The plan makes sense to me. Regarding the docs, it seems better to have a separate versioned website because there are a lot of changes spread all over the place. We can add the banner to remind users that they are looking at the Blink docs, which is temporary and will eventually be

Parallel CEP

2019-01-23 Thread dhanuka ranasinghe
Hi All, Is there a way to run a CEP function in parallel? Currently CEP runs only sequentially [image: flink-CEP.png]. Cheers, Dhanuka -- Nothing Impossible, Creativity is more important than knowledge.

Re: [Feature]Returning RuntimeException to REST client while job submission

2019-01-23 Thread Lavkesh Lahngir
Hello, I mentioned in the first email. Version: 1.6.2, Commit ID: 3456ad0 On Thu, Jan 24, 2019 at 12:33 AM Chesnay Schepler wrote: > I suggest that you first tell me which version you are using so that I > can a) reproduce the issue and b) check that this issue wasn't fixed in > master or a

[jira] [Created] (FLINK-11422) Prefer testing class to mock StreamTask in AbstractStreamOperatorTestHarness

2019-01-23 Thread TisonKun (JIRA)
TisonKun created FLINK-11422: Summary: Prefer testing class to mock StreamTask in AbstractStreamOperatorTestHarness Key: FLINK-11422 URL: https://issues.apache.org/jira/browse/FLINK-11422 Project: Flink

Re: Flink CEP : Doesn't generate output

2019-01-23 Thread dhanuka ranasinghe
Thank you for the clarification. On Thu, 24 Jan 2019, 12:44 Dian Fu wrote: > Hi Dhanuka, > > From the code you shared, it seems that you're using event time. The > processing of elements is triggered by watermark in event time and so you > should define how to generate the watermark, i.e. with >

Re: Parallel CEP

2019-01-23 Thread Dian Fu
Hi Dhanuka, In order to make the CEP operator run in parallel, the input stream should be a KeyedStream. You can refer to [1] for detailed information. Regards, Dian [1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns > On Jan 24, 2019, at 10:18 AM, dhanuka

Re: Flink CEP : Doesn't generate output

2019-01-23 Thread Dian Fu
Hi Dhanuka, From the code you shared, it seems that you're using event time. In event time, the processing of elements is triggered by the watermark, so you should define how to generate the watermark, i.e. with DataStream.assignTimestampsAndWatermarks. Regards, Dian > On
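
[Editor's sketch] The point about watermarks triggering event-time processing can be illustrated without Flink. The class below is a simplified model, not Flink's implementation: the name WatermarkSketch, its methods, and the "bounded out-of-orderness" rule (watermark = highest timestamp seen minus a fixed bound) are assumptions chosen for illustration, and handling of late events is left out. It shows why buffered events produce no output until the watermark moves past them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical, simplified model of watermark-driven processing (not Flink code).
class WatermarkSketch {
    private final long maxOutOfOrderness;                 // assumed fixed bound on disorder
    private long maxTimestamp = Long.MIN_VALUE;           // highest event timestamp seen so far
    private final PriorityQueue<Long> buffer = new PriorityQueue<>();

    WatermarkSketch(long maxOutOfOrderness) {
        this.maxOutOfOrderness = maxOutOfOrderness;
    }

    // Watermark under the assumed rule: highest timestamp minus the bound.
    long currentWatermark() {
        return maxTimestamp == Long.MIN_VALUE
                ? Long.MIN_VALUE
                : maxTimestamp - maxOutOfOrderness;
    }

    // Buffers the event, advances the watermark, and returns (in timestamp order)
    // every buffered event whose timestamp is at or below the new watermark.
    List<Long> onEvent(long timestamp) {
        buffer.add(timestamp);
        maxTimestamp = Math.max(maxTimestamp, timestamp);
        long watermark = currentWatermark();
        List<Long> released = new ArrayList<>();
        while (!buffer.isEmpty() && buffer.peek() <= watermark) {
            released.add(buffer.poll());
        }
        return released;
    }

    public static void main(String[] args) {
        WatermarkSketch sketch = new WatermarkSketch(5);
        System.out.println(sketch.onEvent(10));
        System.out.println(sketch.onEvent(12));
        System.out.println(sketch.onEvent(20));
    }
}
```

With a bound of 5, the events at 10 and 12 stay buffered until the event at 20 moves the watermark to 15. If no watermark is ever generated, nothing is released, which matches the "no output" symptom discussed in this thread.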

[jira] [Created] (FLINK-11420) Serialization of case classes containing a Map[String, Any] sometimes throws ArrayIndexOutOfBounds

2019-01-23 Thread JIRA
Jürgen Kreileder created FLINK-11420: Summary: Serialization of case classes containing a Map[String, Any] sometimes throws ArrayIndexOutOfBounds Key: FLINK-11420 URL:

[jira] [Created] (FLINK-11421) Providing more compilation options for code-generated operators

2019-01-23 Thread Liya Fan (JIRA)
Liya Fan created FLINK-11421: Summary: Providing more compilation options for code-generated operators Key: FLINK-11421 URL: https://issues.apache.org/jira/browse/FLINK-11421 Project: Flink

Re: Parallel CEP

2019-01-23 Thread dhanuka ranasinghe
Hi Dian, I tried that, but then the Kafka producer only produces to a single partition and only a single Flink host is working while the rest do not contribute to processing. I will share the code and a screenshot. Cheers Dhanuka On Thu, 24 Jan 2019, 12:31 Dian Fu wrote: > Hi Dhanuka, > > In order to make the CEP operator

Re: [Feature]Returning RuntimeException to REST client while job submission

2019-01-23 Thread Lavkesh Lahngir
Hi, It's not fixed in the master. I compiled and ran it yesterday. I am not sure if that is an issue or a design choice. On Thu, Jan 24, 2019 at 11:38 AM Lavkesh Lahngir wrote: > Hello, > I mentioned in the first email. > > Version: 1.6.2, Commit ID: 3456ad0 > > On Thu, Jan 24, 2019 at 12:33 AM

Side Outputs for late arriving records

2019-01-23 Thread Ramya Ramamurthy
Hi, I have a query with regard to late-arriving records. We are using Flink 1.7 with Kafka Consumers of version 2.11.0.11. In my sink operators, which convert this table to a stream which is being pushed to Elastic Search, I am able to see this metric "*numLateRecordsDropped*". My Kafka

Re: Parallel CEP

2019-01-23 Thread Dian Fu
I'm afraid you cannot do that. The inputs having the same key should be processed by the same CEP operator. Otherwise the results will be nondeterministic and also wrong. Regards, Dian > On Jan 24, 2019, at 2:56 PM, dhanuka ranasinghe wrote: > > In this example the key will be the same. I am using 1 million
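
[Editor's sketch] A rough illustration of why one million records with the same key end up on a single parallel instance. This is a simplified model, not Flink's key-group assignment: the class KeyRoutingSketch, the helper subtaskFor, and the plain "hash modulo parallelism" rule are assumptions made for illustration. The property it demonstrates is the one described in this thread: equal keys always map to the same operator instance.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of key-based routing (not Flink code).
class KeyRoutingSketch {

    // Assumed routing rule: hash of the key modulo the operator parallelism.
    static int subtaskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Counts how many records each parallel subtask would receive.
    static Map<Integer, Integer> distribute(String[] keys, int parallelism) {
        Map<Integer, Integer> load = new HashMap<>();
        for (String key : keys) {
            load.merge(subtaskFor(key, parallelism), 1, Integer::sum);
        }
        return load;
    }

    public static void main(String[] args) {
        // Every record carries the same key, as in the performance test described above.
        String[] sameKey = new String[1000];
        Arrays.fill(sameKey, "trigger-1");
        System.out.println("same key:      " + distribute(sameKey, 4));

        // Distinct keys can spread across the available subtasks.
        String[] distinctKeys = {"a", "b", "c", "d"};
        System.out.println("distinct keys: " + distribute(distinctKeys, 4));
    }
}
```

In this model the same-key input lands entirely on one subtask regardless of parallelism, so spreading the load would need a key with more distinct values, provided the pattern logic permits matching per key.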

[jira] [Created] (FLINK-11423) Propagate the error message from Main method to JarRunHandler

2019-01-23 Thread Lavkesh Lahngir (JIRA)
Lavkesh Lahngir created FLINK-11423: Summary: Propagate the error message from Main method to JarRunHandler Key: FLINK-11423 URL: https://issues.apache.org/jira/browse/FLINK-11423 Project: Flink

Re: Parallel CEP

2019-01-23 Thread Dian Fu
Whether to use a KeyedStream depends on the logic of your job, i.e. whether you are looking for patterns within some partitions, e.g. patterns for a particular user. If so, you should partition the input data before the CEP operator. Otherwise, the input data should not be partitioned. Regards, Dian

Re: Parallel CEP

2019-01-23 Thread Dian Fu
Hi Dhanuka, Does the KeySelector of Event::getTriggerID generate the same key for all the inputs, or does it generate only very few key values that happen to be hashed to the same downstream operator? You can print the results of Event::getTriggerID to check if that's the case. Regards,

Re: Parallel CEP

2019-01-23 Thread dhanuka ranasinghe
In this example the key will be the same. I am using 1 million messages with the same key for performance testing. But I still want to process them in parallel. Can't I use the Split function and get a SplitStream for that purpose? On Thu, Jan 24, 2019 at 2:49 PM Dian Fu wrote: > Hi Dhanuka, > > Does the