Re: Apache Flink and Kudu integration

2016-11-10 Thread Fabian Hueske
Hi Ruben, that sounds great! In case you are planning to contribute your connector, you should have a look at Apache Bahir [1]. Bahir is a project that collects connectors and other extensions of distributed analytics platforms (currently Flink and Spark). As of now, it offers Flink connectors to

Re: Apache Flink and Kudu integration

2016-11-10 Thread Márton Balassi
Hi Ruben, Thanks. Let us know how you progress. Funny enough, I am playing with the Spark Kudu connector today; I would be interested in checking out your code as soon as you have something tangible. Best, Marton On Thu, Nov 10, 2016 at 10:19 AM, Fabian Hueske wrote: > Hi Ruben, > > that sound

[FLINK-4541] Support for SQL NOT IN operator

2016-11-10 Thread Alexander Shoshin
Hi, I am working on the FLINK-4541 issue and these are my current changes: https://github.com/apache/flink/compare/master...AlexanderShoshin:FLINK-4541. I found that NOT IN does not work with nested queries because of a missing DataSet planner rule for a cross join. After adding DataSetCrossJoinRule sev
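For context, a minimal sketch of the query shape under discussion: a NOT IN predicate fed by a nested subquery. The table and column names below are made up for illustration and are not taken from the thread.

    // Illustration only: a NOT IN predicate with a nested subquery, the shape
    // discussed in FLINK-4541. "Orders" and "Customers" are hypothetical tables.
    public class NotInQueryExample {
        static final String QUERY =
            "SELECT o.id, o.amount "
          + "FROM Orders o "
          + "WHERE o.customerId NOT IN ("
          + "  SELECT c.id FROM Customers c WHERE c.country = 'DE'"
          + ")";
    }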

[jira] [Created] (FLINK-5048) Kafka Consumer (0.9/0.10) threading model leads to problematic cancellation behavior

2016-11-10 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-5048: --- Summary: Kafka Consumer (0.9/0.10) threading model leads to problematic cancellation behavior Key: FLINK-5048 URL: https://issues.apache.org/jira/browse/FLINK-5048 Project

[DISCUSS] Adding a dispose() method in the RichFunction.

2016-11-10 Thread Kostas Kloudas
Hello, I would like to propose the addition of a dispose() method, alongside the already existing close(), in the RichFunction interface. This will align the lifecycle of a RichFunction with that of an Operator. After this, the code paths followed when finishing successfully and when ca

Re: [DISCUSS] Adding a dispose() method in the RichFunction.

2016-11-10 Thread Fabian Hueske
RichFunctions are used in the DataStream and DataSet APIs. How would that change affect the DataSet API? Best, Fabian 2016-11-10 11:37 GMT+01:00 Kostas Kloudas : > Hello, > > I would like to propose the addition of a dispose() method, in addition to > the > already existing close(), in the Rich

Re: [DISCUSS] Adding a dispose() method in the RichFunction.

2016-11-10 Thread Kostas Kloudas
Given that in the DataSet API there is no semantic distinction between the two, I suppose we just have to make sure that whenever close() is called, the new dispose() is called as well. Best, Kostas > On Nov 10, 2016, at 11:47 AM, Fabian Hueske wrote: > > RichFunctions are used in the
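To make the proposal concrete, here is a minimal sketch of what a RichFunction-style interface with the proposed dispose() hook could look like. This is illustrative only; the interface name and comments are assumptions based on the thread and the operator lifecycle, not the actual Flink API or the final design.

    import org.apache.flink.configuration.Configuration;

    // Sketch of the proposed lifecycle: open() and close() as today, plus a
    // dispose() that is invoked on every termination path (success, failure,
    // cancellation), mirroring the operator lifecycle.
    public interface RichFunctionWithDispose {

        /** Called once before any records are processed. */
        void open(Configuration parameters) throws Exception;

        /** Called when processing finishes successfully (as in the current API). */
        void close() throws Exception;

        /** Proposed: always called last, also on cancellation or failure. */
        void dispose() throws Exception;
    }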

Re: [DISCUSS] FLIP-5 Only send data to each taskmanager once for broadcasts

2016-11-10 Thread Felix Neutatz
Hi everybody, the previous approach turned out to have an issue. Since we only write to one subpartition, we have N-1 empty subpartitions per Task (where N = degree of parallelism). In the current approach I didn't consume these empty subpartitions. When you don't consume a subpartition it won't b

Re: Flink ML recommender system API

2016-11-10 Thread Gábor Hermann
Hi all, We have managed to fit the ranking recommendation evaluation into the evaluation framework proposed by Theodore (FLINK-2157). There's one main problem that remains: we have two different predictor traits (Predictor, RankingPredictor) without a common superclass, and that might be probl

RocksDB IO error

2016-11-10 Thread zhenhao.li
Hi team, Could you please investigate this? It seems that we may face data loss when using RocksDB with enableFullyAsyncSnapshots. It was Flink 1.1.3. Regards, Zhenhao 2016-11-03 13:42:47,071 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph- Source: Custom Source -> (
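For reference, a minimal sketch of how the fully asynchronous RocksDB snapshot mode is typically enabled on Flink 1.1.x. The checkpoint URI and interval below are placeholders, not taken from the report.

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RocksDbFullyAsyncSetup {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Placeholder checkpoint URI; point this at your own durable storage.
            RocksDBStateBackend backend = new RocksDBStateBackend("hdfs:///flink/checkpoints");
            backend.enableFullyAsyncSnapshots();   // the snapshot mode mentioned in the report
            env.setStateBackend(backend);

            env.enableCheckpointing(60000);        // checkpoint every 60 seconds
            // ... define sources, transformations, sinks, then env.execute(...)
        }
    }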

Re: [DISCUSS] Proposed updates to Flink project site

2016-11-10 Thread Mike Winters
Thanks everyone, I appreciate the comments. As for new content: aside from the graphic on the homepage, the only significant addition will be an introductory writeup [1]. This doc has commenting enabled, so please add feedback as you see fit. Note that in this version of the mockups, the current 'Feat

Re: [DISCUSS] Proposed updates to Flink project site

2016-11-10 Thread Ufuk Celebi
On 10 November 2016 at 14:46:57, Mike Winters (mwi...@gmail.com) wrote: > > @Max, I was also unsure about the left-hand nav as it's a substantial > change from the status quo, however, I believe it provides more > guidance on > a 'logical' order for a new user to move through the site. Definitely

Re: Flink ML recommender system API

2016-11-10 Thread Theodore Vasiloudis
Hello Gabor, for this type of issue (design decisions), what we've done in the past with FlinkML is to open a PR marked with the WIP tag and take the discussion there, making it easier for people to check out the code and get a feel for the advantages/disadvantages of different approaches. Could you do

Re: Flink ML recommender system API

2016-11-10 Thread Gábor Hermann
Hello Theodore, Thanks for your reply. Of course. I would have done that in the first place, but I had seen the contribution guideline advising against WIP PRs: "No WIP pull requests. We consider pull requests as requests to merge the referenced code as is into the current stable master bran

[DISCUSS] FLIP-14: Loops API and Termination

2016-11-10 Thread Paris Carbone
Hi again Flink folks, Here is our new proposal that addresses Job Termination - the loop fault tolerance proposal will follow shortly. As Stephan hinted, we need operators to be aware of their scope level. Thus, it is time we make loops great again! :) Part of this FLIP basically introduces a n

[jira] [Created] (FLINK-5049) Instability in QueryableStateITCase.testQueryableStateWithTaskManagerFailure

2016-11-10 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5049: -- Summary: Instability in QueryableStateITCase.testQueryableStateWithTaskManagerFailure Key: FLINK-5049 URL: https://issues.apache.org/jira/browse/FLINK-5049 Project: Flink

Travis CI

2016-11-10 Thread Greg Hogan
We're getting the dreaded "The job exceeded the maximum time limit for jobs, and has been terminated." error for some recent Travis-CI builds. https://travis-ci.org/apache/flink/builds/174615801 The docs state that termination will occur when "A job takes longer than 50 minutes on travis-ci.org"

Re: Travis CI

2016-11-10 Thread Ufuk Celebi
I also just noticed it today. We used to work with the 120-minute limit, and the builds took way longer, as you said. I don't know what's going on here... It might be related to some issues they reported a few hours ago (https://twitter.com/traviscistatus), but I can't tell. I really hope that t

Re: [DISCUSS] Releasing Flink 1.1.4

2016-11-10 Thread Ufuk Celebi
The last fixes are finally in. Thanks to everyone who participated in the discussion. I will now create the release artifacts and start the vote tomorrow (CET). – Ufuk On 8 November 2016 at 19:02:46, Stephan Ewen (se...@apache.org) wrote: > I opened a pull request for the backport of [FLINK-490

Re: [FLINK-4541] Support for SQL NOT IN operator

2016-11-10 Thread Fabian Hueske
Hi Alexander, Thanks for looking into this issue! We did not support CROSS JOIN on purpose because the general case is very expensive to compute. Also, as you noticed, we would have to make sure that inner joins are preferred over cross joins (if possible). Cost-based optimizers (such as Calcite's

[jira] [Created] (FLINK-5050) JSON.org license is CatX

2016-11-10 Thread Ted Yu (JIRA)
Ted Yu created FLINK-5050: - Summary: JSON.org license is CatX Key: FLINK-5050 URL: https://issues.apache.org/jira/browse/FLINK-5050 Project: Flink Issue Type: Bug Reporter: Ted Yu We sh

Re: [DISCUSS] FLIP-14: Loops API and Termination

2016-11-10 Thread SHI Xiaogang
Hi Paris, I have several concerns about the correctness of the termination protocol. I think the termination protocol puts an end to the computation even when the computation has not converged. Suppose there exists a loop context constructed by an OP operator, a Head operator and a Tail operator (il

Flink using Yarn on MapR

2016-11-10 Thread Naveen Tirupattur
Hi, I am trying to set up Flink with YARN on MapR. I built Flink using the following command, and the build finished successfully: mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.7.0-mapr-1607 -Dhadoop.vendor=MapR Now when I try to start a YARN session I am seeing the below error 2016