Re: [DISCUSS] Hold copies in HeapStateBackend

2016-11-22 Thread Fabian Hueske
Does anybody have objections against copying the first record that goes into the ReduceState? 2016-11-22 12:49 GMT+01:00 Aljoscha Krettek : > That's right, yes. > > On Mon, 21 Nov 2016 at 19:14 Fabian Hueske wrote: > > > Right, but that would be a much bigger change

Re: [DISCUSS] Merge batch and stream connector modules

2016-11-22 Thread Fabian Hueske
Hi all, should we do this refactoring for the 1.2 release? If yes, I'll prepare a PR for that. Cheers, Fabian 2016-09-26 13:55 GMT+02:00 Fabian Hueske : > Thanks everybody for your comments. > > I opened FLINK-4676 [1] for merging the connector modules. > > [1] https://is

Re: [RESULT] [VOTE] Release Apache Flink 1.1.4 (RC1)

2016-11-21 Thread Fabian Hueske
(2) We merged a fix that changes the YARN class loader behaviour. I > > > >> think it's better to > > > >>> revert that change since changed class loading behaviour with minor > > > >> releases can be > > > >>> quite unexpected:

Re: [DISCUSS] Hold copies in HeapStateBackend

2016-11-21 Thread Fabian Hueske
roblem anymore. > > So I'm very much in favour of keeping data serialised. Copying data would > only ever be a stopgap solution. > > On Mon, 21 Nov 2016 at 15:56 Fabian Hueske wrote: > > > Another approach that would solve the problem for our use case (object > > re-

Re: [DISCUSS] Hold copies in HeapStateBackend

2016-11-21 Thread Fabian Hueske
13:48, Aljoscha Krettek wrote: > > > >> Hi, > >> I would be in favour of this since it brings things in line with the > >> RocksDB backend. This will, however, come with quite the performance > >> overhead, depending on how fast the TypeSerializer can copy. >

[DISCUSS] Hold copies in HeapStateBackend

2016-11-21 Thread Fabian Hueske
Hi everybody, when implementing a ReduceFunction for incremental aggregation of SQL / Table API window aggregates we noticed that the HeapStateBackend does not store copies but holds references to the original objects. In case of a SlidingWindow, the same object is referenced from different window
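The aliasing effect described in this thread can be reproduced without Flink. What follows is only a minimal sketch in plain Java, not Flink code: the `Sum` type, the `add` and `run` helpers, and the window names `w1` and `w2` are hypothetical, and a `HashMap` stands in for heap state that keeps references to the objects it is given. The assumption is a reduce function that updates its first argument in place.

```java
import java.util.HashMap;
import java.util.Map;

final class HeapStateAliasing {

    /** Mutable accumulator, standing in for a record that a reduce function updates in place. */
    static final class Sum {
        long value;
        Sum(long value) { this.value = value; }
        Sum copy() { return new Sum(value); }
    }

    /** In-place reduce: mutates and returns the first argument. */
    static Sum reduce(Sum acc, Sum next) {
        acc.value += next.value;
        return acc;
    }

    /**
     * Adds one record to every window it belongs to (hypothetical helper).
     * With copyOnFirstInsert == false the map stores the caller's object itself,
     * so several windows can end up sharing one accumulator.
     */
    static void add(Map<String, Sum> state, String[] windows, Sum record, boolean copyOnFirstInsert) {
        for (String window : windows) {
            Sum current = state.get(window);
            if (current == null) {
                state.put(window, copyOnFirstInsert ? record.copy() : record);
            } else {
                state.put(window, reduce(current, record));
            }
        }
    }

    /** Feeds the records 1 and 10 into two overlapping windows and returns both sums. */
    static long[] run(boolean copyOnFirstInsert) {
        Map<String, Sum> state = new HashMap<>();
        String[] windows = {"w1", "w2"};
        add(state, windows, new Sum(1), copyOnFirstInsert);
        add(state, windows, new Sum(10), copyOnFirstInsert);
        return new long[] {state.get("w1").value, state.get("w2").value};
    }

    public static void main(String[] args) {
        long[] shared = run(false);
        long[] copied = run(true);
        System.out.println("references: w1=" + shared[0] + " w2=" + shared[1]);
        System.out.println("copies:     w1=" + copied[0] + " w2=" + copied[1]);
    }
}
```

In this sketch both windows hold the same object when no copy is made, so each window reports 21 instead of the expected 11. Copying only the first record that enters the state gives 11 for each window.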

[jira] [Created] (FLINK-5094) Support RichReduceFunction and RichFoldFunction as incremental window aggregation functions

2016-11-18 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-5094: Summary: Support RichReduceFunction and RichFoldFunction as incremental window aggregation functions Key: FLINK-5094 URL: https://issues.apache.org/jira/browse/FLINK-5094

Re: [DISCUSS] FLIP-14: Loops API and Termination

2016-11-17 Thread Fabian Hueske
Hi Paris, just gave you the permissions (I hope). Let me know if something does not work. Cheers, Fabian 2016-11-17 13:48 GMT+01:00 Paris Carbone : > We do not have to schedule this for an early Flink release, just saying. > I would just like to get the changes out and you people can review it

Re: Request for permissions in JIRA

2016-11-17 Thread Fabian Hueske
Welcome Boris, I gave you JIRA contributor permissions (you can now assign issues to yourself) and assigned FLINK-5006 to you. Thanks, Fabian 2016-11-17 8:42 GMT+01:00 Boris Osipov : > Hi folks! > I've done https://issues.apache.org/jira/browse/FLINK-5006 and want to > resolve it. Could you ple

[jira] [Created] (FLINK-5084) Replace Java Table API integration tests by unit tests

2016-11-16 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-5084: Summary: Replace Java Table API integration tests by unit tests Key: FLINK-5084 URL: https://issues.apache.org/jira/browse/FLINK-5084 Project: Flink Issue

Re: [VOTE] Release Apache Flink 1.1.4 (RC1)

2016-11-14 Thread Fabian Hueske
+1 - Checked hashes and signatures of release artifacts - Checked commit diff against v1.1.3: Did not find any added or changed dependencies - Built from source with default settings for Scala 2.10 and 2.11 on MacOS, Java 8 Cheers, Fabian 2016-11-13 14:39 GMT+01:00 Gyula Fóra : > Hi, > > +1 fro

Re: issue assignment (FLINK-5002)

2016-11-14 Thread Fabian Hueske
Hi Roman, welcome to the Flink community. I assigned the issue to you and gave you Contributor permissions in JIRA. You can now assign issues to yourself. Best, Fabian 2016-11-14 12:45 GMT+01:00 Roman Maier : > Hi folks. > > I want to do something for Flink. As the first step I have started doi

Re: [FLINK-4541] Support for SQL NOT IN operator

2016-11-11 Thread Fabian Hueske
the Calcite team will change a logical plan for NOT IN and > we will be back to this issue. > > Regards, > Alexander > > -Original Message- > From: Fabian Hueske [mailto:fhue...@gmail.com] > Sent: Friday, November 11, 2016 2:04 AM > To: dev@flink.apache.org > Subjec

Re: [FLINK-4541] Support for SQL NOT IN operator

2016-11-10 Thread Fabian Hueske
Hi Alexander, Thanks for looking into this issue! We did not support CROSS JOIN on purpose because the general case is very expensive to compute. Also as you noticed we would have to make sure that inner-joins are preferred over cross joins (if possible). Cost-based optimizers (such as Calcite's

Re: [DISCUSS] Adding a dispose() method in the RichFunction.

2016-11-10 Thread Fabian Hueske
RichFunctions are used in the DataStream and DataSet APIs. How would that change affect the DataSet API? Best, Fabian 2016-11-10 11:37 GMT+01:00 Kostas Kloudas : > Hello, > > I would like to propose the addition of a dispose() method, in addition to > the > already existing close(), in the Rich

Re: Apache Flink and Kudu integration

2016-11-10 Thread Fabian Hueske
Hi Ruben, that sounds great! In case you are planning to contribute your connector, you should have a look at Apache Bahir [1]. Bahir is a project that collects connectors and other extensions of distributed analytics platforms (currently Flink and Spark). As of now, it offers Flink connectors to

Re: Left outer join

2016-11-08 Thread Fabian Hueske
This should work as well: input .leftOuterJoin(metrics) .where("*") .equalTo(0) { (left,right) => ... } 2016-11-08 23:10 GMT+01:00 Till Rohrmann : > Hi Thomas, > > either you map your input data set into a tuple input.map(x => Tuple1(x)) > or you specify a key selector leftOuterJoin(me

[jira] [Created] (FLINK-5031) Consecutive DataStream.split() ignored

2016-11-07 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-5031: Summary: Consecutive DataStream.split() ignored Key: FLINK-5031 URL: https://issues.apache.org/jira/browse/FLINK-5031 Project: Flink Issue Type: Bug

Re: Stream processing via Kafka using Android

2016-11-07 Thread Fabian Hueske
Hi Artur, I'm not sure if Flink is the best tool to write data from Android Apps to Kafka. Flink would be a good choice to process such data by reading it from Kafka. I would reach out to the Kafka user list and seek advice there. Best, Fabian Btw. The Apache dev@ mailing lists are meant to

Re: [FLINK-3848] Add ProjectableTableSource

2016-11-02 Thread Fabian Hueske
Hi Anton, a regular TableSource does not accept a predicate and returns the whole table. A ProjectableTableSource is able to evaluate a predicate while scanning. TableSources that evaluate predicates while (or rather before) scanning can significantly reduce IO compared to a full scan. Sources tha

Re: Assign a unique id to each line of a dataset

2016-11-02 Thread Fabian Hueske
Hi Thomas, have a look at DataSetUtils [1]. Best, Fabian [1] https://github.com/apache/flink/blob/master/flink-java/src/main/java/org/apache/flink/api/java/utils/DataSetUtils.java 2016-11-02 13:17 GMT+01:00 Thomas FOURNIER : > Hello, > > Is it possible with the current Flink-API to give a uniq

Re: [DISCUSS] FLIP-11: Table API Stream Aggregations

2016-11-02 Thread Fabian Hueske
g the > design. I hope to see a powerful, concise, and compatible TableAPI. > I agree with your other comments on Flip11. > > Regards, > Shaoxuan > > > On Fri, Oct 28, 2016 at 11:09 PM, Fabian Hueske wrote: > > > Thanks for bringing up this point Stephan. > > >

Re: [DISCUSS] Make FieldAccessor logic consistent with remaining API

2016-10-28 Thread Fabian Hueske
d be overkill, because we don't need > to set multiple fields at a time. > > It certainly seems to be a good idea to unify these at least > externally (from the side of the users), but I'm not sure what would > be a neat way to also internally unify them. > > Best, >

Re: [DISCUSS] Schedule and Scope for Flink 1.2

2016-10-28 Thread Fabian Hueske
> > >>>> Do we also want to get in the "Trigger DSL" that we've had brewing > for a > >>>> while now? > >>>> > >>>> On Mon, 17 Oct 2016 at 16:17 Stephan Ewen wrote: > >>>> > >>>>> I thi

Re: trouble with type casting

2016-10-28 Thread Fabian Hueske
Hi Anton, it seems that the Table API validation is more strict than Calcite's SQL validator (maybe because it is not aware of the actual implementation). In principle, it is correct to prevent the auto-casting from BigDecimal to double. I think it is fine to request an explicit cast from users (

Re: [DISCUSS] FLIP-11: Table API Stream Aggregations

2016-10-28 Thread Fabian Hueske
background (or other streaming APIs) and it is more > > discoverable in the IDE. > > > > - I like the idea of having separate ".window()" and ".rowWindow()" > > clauses. Makes it more explicit that very different things will happen. > > > > -

Re: [Discuss] FLIP-13 Side Outputs in Flink

2016-10-27 Thread Fabian Hueske
it just related to stream api? This feature could be really useful for > etl scenarios with dataset api as well. > > On Oct 26, 2016 22:29, "Fabian Hueske" wrote: > > > Hi Chen, > > > > thanks for this interesting proposal. I think side output would be a very >

[jira] [Created] (FLINK-4937) Add incremental group window aggregation for streaming Table API

2016-10-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4937: Summary: Add incremental group window aggregation for streaming Table API Key: FLINK-4937 URL: https://issues.apache.org/jira/browse/FLINK-4937 Project: Flink

Re: DataStream#explain

2016-10-26 Thread Fabian Hueske
Hi Anton, I think you can do it similarly to BatchTableEnvironment#explain(table: Table, extended: Boolean), which calls ExecutionEnvironment.getExecutionPlan(). StreamExecutionEnvironment also has a getExecutionPlan() method. The stream execution plan will probably contain different information t

Re: [Discuss] FLIP-13 Side Outputs in Flink

2016-10-26 Thread Fabian Hueske
Hi Chen, thanks for this interesting proposal. I think side output would be a very valuable feature to have! I went over the FLIP and have a few questions. - Will multiple side outputs of the same type be supported? - If I got it right, the FLIP proposes to change the signatures of many user-defin

[DISCUSS] Make FieldAccessor logic consistent with remaining API

2016-10-26 Thread Fabian Hueske
Hi everybody, while reviewing PR #2094 [1] I noticed that the field reference syntax for FieldAccessors is not compatible with the syntax supported for key definitions (ExpressionKeys) used in groupBy(), keyBy(), join().where().equalTo(), etc. FieldAccessors are only used for built-in aggregation

Re: Flink 1.2-SNAPSHOT Bucketing Sink problem

2016-10-26 Thread Fabian Hueske
e way, the BucketingSink function is really good at sinking to hdfs > (Especially for event time file naming). However, currently in version > 1.1.3, the RollingSink function is used. Is there any plan to release the > BucketingSink function in stable version? > > _______

Re: Flink 1.2-SNAPSHOT Bucketing Sink problem

2016-10-26 Thread Fabian Hueske
Hi Ozan, a NoSuchMethodError indicates a version mismatch. Since you are using a SNAPSHOT build it is likely that dependencies changed when you recompiled Flink or your job. You should make sure that you are using the same version (on SNAPSHOT the same commit) for jobs and cluster. Best, Fabian

Re: [FLINK-2118] Table API fails on composite filter conditions

2016-10-21 Thread Fabian Hueske
Hi Alexander, I agree and will close the issue. Thanks for looking into this! Best, Fabian 2016-10-21 16:37 GMT+02:00 Alexander Shoshin : > Hi everybody. > > I propose to close this issue: > https://issues.apache.org/jira/browse/FLINK-2118 > as "not a bug", because we can't change scala behavio

Re: [DISCUSS] Defining the Semantics of StreamingSQL

2016-10-20 Thread Fabian Hueske
he implicit END_OF_WINDOW and add the possibility to configure timestamps later. Best, Fabian 2016-10-18 16:43 GMT+02:00 Fabian Hueske : > Hi everybody, > > at Flink Forward we had a BOF session about StreamSQL. > After the conference, some folks and I sat down and drafted a propos

[jira] [Created] (FLINK-4864) Shade Calcite dependency in flink-table

2016-10-20 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4864: Summary: Shade Calcite dependency in flink-table Key: FLINK-4864 URL: https://issues.apache.org/jira/browse/FLINK-4864 Project: Flink Issue Type

Re: Type erasure problem solely on cluster execution

2016-10-19 Thread Fabian Hueske
Hi Martin, thanks for reporting the problem and providing code to reproduce it. Would you mind describing the problem with the forwarding annotations in more detail? I would be interested in the error message and how the semantic annotation is provided (@ForwardFields or withForwardedFields()).

[DISCUSS] Defining the Semantics of StreamingSQL

2016-10-18 Thread Fabian Hueske
Hi everybody, at Flink Forward we had a BOF session about StreamSQL. After the conference, some folks and I sat down and drafted a proposal for Flink's StreamSQL semantics. --> https://docs.google.com/document/d/1qVVt_16kdaZQ8RTfA_f4konQPW4tnl8THw6rzGUdaqU The proposal includes: - A definition f

[DISCUSS] Schedule and Scope for Flink 1.2

2016-10-17 Thread Fabian Hueske
Hi everybody, Flink 1.1.0 was released in August and I think it is time to kick off a discussion about the schedule and scope of Flink 1.2.0. == Scope We started to collect features for Flink 1.2.0 in the Flink Release wiki page [1]. I copy the feature list for convenience: - Dynamic Scaling

Re: Reply: RE: [DISCUSS] FLIP-11: Table API Stream Aggregations

2016-10-14 Thread Fabian Hueske
> +1 to change groupBy() to partitionBy(). I would not move partitionBy() into the RowWindow definition but keep it outside to ensure only one partitioning is defined. The orderBy definition is already fluently included in the RowWindow via the on() method. > Regards, > Shaoxuan >

[DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment

2016-10-14 Thread Fabian Hueske
Hi everybody, I would like to propose deprecating the utility methods to read data with Hadoop InputFormats from the (batch) ExecutionEnvironment. The motivation for deprecating these methods is to reduce Flink's dependency on Hadoop and instead have Hadoop as an optional dependency for users that a

Re: [DISCUSS] Drop Hadoop 1 support with Flink 1.2

2016-10-14 Thread Fabian Hueske
ve the methods from the > ExecutionEnvironment. > > On Fri, Oct 14, 2016 at 10:45 AM, Fabian Hueske wrote: > > > Yes, I'm also +1 for removing the methods at some point. > > > > For 1.2 we could go ahead and move the Hadoop-MR connectors into a > separate >

Re: [DISCUSS] Drop Hadoop 1 support with Flink 1.2

2016-10-14 Thread Fabian Hueske
istribution. > > On Fri, Oct 14, 2016 at 10:37 AM, Fabian Hueske wrote: > > > +1 for dropping Hadoop1 support. > > > > Regarding a binary release without Hadoop: > > > > What would we do about the readHadoopFile() and createHadoopInput() on > the > >

Re: [DISCUSS] Drop Hadoop 1 support with Flink 1.2

2016-10-14 Thread Fabian Hueske
+1 for dropping Hadoop1 support. Regarding a binary release without Hadoop: What would we do about the readHadoopFile() and createHadoopInput() on the ExecutionEnvironment? These methods are declared as @PublicEvolving, so we did not commit to keep them. However that does not necessarily mean we

Re: How to contribute to Streaming Table API and StreamSQL

2016-10-13 Thread Fabian Hueske
ctly defining the user aggregates in that SQL syntax instead of the > JSON configuration that we now have for the purpose.) > > Cheers, > Juho > > On Fri, Jun 17, 2016 at 2:26 PM, Fabian Hueske wrote: > > > If we want to have it in Stream SQL yes. Although we can also

Re: Reply: RE: [DISCUSS] FLIP-11: Table API Stream Aggregations

2016-10-13 Thread Fabian Hueske
n are not implemented in Calcite yet. There is CALCITE-1345 [1] to track the issue. As Shaoxuan mentioned, we are not using the STREAM keyword to be SQL compliant. Best, Fabian [1] https://issues.apache.org/jira/browse/CALCITE-1345 2016-10-13 12:05 GMT+02:00 Fabian Hueske : > Hi everybody,

Re: Reply: RE: [DISCUSS] FLIP-11: Table API Stream Aggregations

2016-10-13 Thread Fabian Hueske
"select" operator. This will make the TableAPI in the similar manner as SQL. > For instance, [A groupby without window] > > val res = tab > .groupBy('a) > .select('a, 'b.sum) > > SELECT a, SUM(b) > FROM tab > GROUP BY a > [A tumble window inside groupby] > val res

Re: java.lang.IllegalArgumentException: JDBC-Class not found. - org.postgresql.jdbc.Driver

2016-10-12 Thread Fabian Hueske
Hi Sunny, please avoid crossposting to all mailing lists. The dev@f.a.o list is for issues related to the development of Flink, not the development of Flink applications. The error message is actually quite descriptive. Flink cannot find the JDBC driver class. You need to add it to the classpath
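A quick way to confirm this kind of diagnosis is to ask the JVM whether the class can be loaded at all. The following is a small sketch using only the standard `Class.forName` call; the `DriverCheck` class and `isOnClasspath` helper are hypothetical names introduced for illustration. As an observation that is not part of the original thread: the PostgreSQL JDBC driver is normally registered as `org.postgresql.Driver`, so it may also be worth checking the class name itself against the driver's documentation.

```java
final class DriverCheck {

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class (or one of its dependencies) is missing from the classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the driver class name as the first argument; the default is only an example.
        String driver = args.length > 0 ? args[0] : "org.postgresql.Driver";
        System.out.println(driver + " loadable: " + isOnClasspath(driver));
    }
}
```

Running this check with the same classpath as the failing job shows whether the driver jar is actually visible to the JVM.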

Re: [VOTE] Release Apache Flink 1.1.3 (RC2)

2016-10-12 Thread Fabian Hueske
+1 to release (binding) - checked hashes and signatures - checked diffs against 1.1.2: no dependencies added or modified - successfully built Flink from source archive - mvn clean install (Scala 2.10) Cheers, Fabian 2016-10-12 14:05 GMT+02:00 Maximilian Michels : > +1 (binding) > > - scanned

Re: Assign issue

2016-10-12 Thread Fabian Hueske
Hi Alexander, Welcome to the Flink community. I gave you contributor permissions for JIRA (you can assign issues to yourself) and assigned the issue to you. Looking forward to your contribution. Best, Fabian 2016-10-12 13:14 GMT+02:00 Alexander Shoshin : > Hi. > > I want to do something for fl

Re: Re: Re: Flink Gelly

2016-10-07 Thread Fabian Hueske
Hi, the exception > java.lang.RuntimeException: Memory ran out. Compaction failed. says that the hash table ran out of memory. Gelly is implemented on top of Flink's DataSet API. So this would rather be a problem with DataSet than Gelly. I think Vasia is right about the memory configuration. Pro

Re: Assign issue

2016-10-07 Thread Fabian Hueske
Hi Anton, I gave you Contributor permissions for JIRA and assigned the issue to you. You can now also assign other issues to yourself. Best, Fabian 2016-10-07 10:09 GMT+02:00 Anton Solovev : > Hi folks. > > > I want to do something for flink. As the first step I have started doing > https://iss

Re: [VOTE] Release Apache Flink 1.1.3 (RC1)

2016-10-06 Thread Fabian Hueske
+1 to release (binding) - checked hashes and signatures - checked diffs against 1.1.2: no dependencies added or modified - successfully built Flink from source archive (Maven 3.3.3, Java 1.8.0_25 Oracle, OS X) - mvn clean install (Scala 2.10) - mvn clean install (Scala 2.11) - mvn clean inst

Re: Duplicate sort keys

2016-10-05 Thread Fabian Hueske
Hi Greg, IMO you are right. We should remove duplicate sort keys. Best, Fabian 2016-10-03 16:04 GMT+02:00 Greg Hogan : > Is it correct to expect that Flink should remove duplicate sort keys? I'm > working on instrumenting the FixedLengthRecordSorter (FLINK-4705) and the > following test case fr

Re: Releasing Flink 1.1.3

2016-10-04 Thread Fabian Hueske
Thanks Ufuk for stepping up as release manager! Yes, I will backport the fix for FLINK-4311 to Flink 1.1.3 and merge it today. 2016-10-04 12:07 GMT+02:00 Ufuk Celebi : > If there are no objections I would like to be the release manager for > this release. > > Furthermore, I would like to add FLIN

[DISCUSS] Bump HBase version for hadoop-2 to 1.2.3

2016-10-04 Thread Fabian Hueske
Hi everybody, Flink's TableInputFormat depends on a very old HBase dependency (0.98.11). We have received user requests (see FLINK-2765 [1]) to update the dependency for hadoop-2 to 1.2. In addition there is a pull request with critical fixes for the HBase TableInputFormat [2] that bumps the vers

Re: Releasing Flink 1.1.3

2016-09-29 Thread Fabian Hueske
Sounds good to me. Can we include a fix for FLINK-4311 as well? There is a PR [1] which is good to go IMO modulo resetting the updated HBase dependency for the 1.1.3 fix. For 1.2.0 we should start a discussion about bumping the version. Cheers, Fabian [1] https://github.com/apache/flink/pull/233

[jira] [Created] (FLINK-4688) Optimizer hangs for hours when optimizing complex plans

2016-09-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4688: Summary: Optimizer hangs for hours when optimizing complex plans Key: FLINK-4688 URL: https://issues.apache.org/jira/browse/FLINK-4688 Project: Flink Issue

[jira] [Created] (FLINK-4678) Add SessionRow row-windows for streaming tables (FLIP-11)

2016-09-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4678: Summary: Add SessionRow row-windows for streaming tables (FLIP-11) Key: FLINK-4678 URL: https://issues.apache.org/jira/browse/FLINK-4678 Project: Flink

[jira] [Created] (FLINK-4683) Add SlideRow row-windows for batch tables

2016-09-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4683: Summary: Add SlideRow row-windows for batch tables Key: FLINK-4683 URL: https://issues.apache.org/jira/browse/FLINK-4683 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-4682) Add TumbleRow row-windows for batch tables.

2016-09-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4682: Summary: Add TumbleRow row-windows for batch tables. Key: FLINK-4682 URL: https://issues.apache.org/jira/browse/FLINK-4682 Project: Flink Issue Type: Sub

[jira] [Created] (FLINK-4681) Add SessionRow row-windows for batch tables.

2016-09-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4681: Summary: Add SessionRow row-windows for batch tables. Key: FLINK-4681 URL: https://issues.apache.org/jira/browse/FLINK-4681 Project: Flink Issue Type: Sub

[jira] [Created] (FLINK-4680) Add SlidingRow row-windows for streaming tables

2016-09-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4680: Summary: Add SlidingRow row-windows for streaming tables Key: FLINK-4680 URL: https://issues.apache.org/jira/browse/FLINK-4680 Project: Flink Issue Type

[jira] [Created] (FLINK-4679) Add TumbleRow row-windows for streaming tables

2016-09-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4679: Summary: Add TumbleRow row-windows for streaming tables Key: FLINK-4679 URL: https://issues.apache.org/jira/browse/FLINK-4679 Project: Flink Issue Type: Sub

Re: [DISCUSS] FLIP-11: Table API Stream Aggregations

2016-09-26 Thread Fabian Hueske
nks for sharing your ideas. >>> >>> They all make sense to me. Regarding reassigning timestamps, I do not >>> have a use case. I came up with this because DataStream has a >>> TimestampAssigner :) >>> >>> +1 for this FLIP. >>>

Re: [DISCUSS] Merge batch and stream connector modules

2016-09-26 Thread Fabian Hueske
> > > > > > > > +1 for Fabian's suggestion > > > > > > > > > > On Thu, Sep 22, 2016 at 3:25 PM, Swapnil Chougule < > > > > the.swapni...@gmail.com > > > > > > > > > > &g

[jira] [Created] (FLINK-4676) Merge flink-batch-con

2016-09-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4676: Summary: Merge flink-batch-con Key: FLINK-4676 URL: https://issues.apache.org/jira/browse/FLINK-4676 Project: Flink Issue Type: Task Reporter

Re: Assign issue

2016-09-23 Thread Fabian Hueske
Hi Evgeny, Welcome to the Flink community. I gave you Contributor permissions in JIRA and assigned the issue to you. In the future you can assign issues to yourself as well. Best, Fabian 2016-09-23 12:10 GMT+02:00 Evgeny Kincharov : > Hi folks. > > I want to do something for flink. As the first

Re: Flink scala data sink code from CSV to postgres.

2016-09-23 Thread Fabian Hueske
Hi Jagan, could you please stop spamming the mailing list? This is at least the third mail with the identical request in a very short time. You already got an answer including a Java code snippet. Translating this into Scala code should be possible. The mailing list is for helping people such tha

[DISCUSS] Merge batch and stream connector modules

2016-09-22 Thread Fabian Hueske
Hi everybody, right now, we have two separate Maven modules for batch and streaming connectors (flink-batch-connectors and flink-streaming-connectors) that contain modules for the individual external systems and storage formats such as HBase, Cassandra, Avro, Elasticsearch, etc. Some of these sys

Re: Assign issue

2016-09-20 Thread Fabian Hueske
Hi Anton, I gave your JIRA account Contributor permissions. You can now assign Flink issues to yourself. You can start a discussion either on the dev mailing list (usually done for major changes and new features) or in the JIRA issue (preferred for minor improvements). Best, Fabian 2016-09-20 14

[jira] [Created] (FLINK-4640) Serialization of the initialValue of a Fold on WindowedStream fails

2016-09-20 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4640: Summary: Serialization of the initialValue of a Fold on WindowedStream fails Key: FLINK-4640 URL: https://issues.apache.org/jira/browse/FLINK-4640 Project: Flink

Re: [DISCUSS] Proposal for Asynchronous I/O in FLINK

2016-09-19 Thread Fabian Hueske
Hi David, thanks for the FLIP! It looks pretty good. A few questions / suggestions: - It would be good to make the number of concurrent AsyncFunction calls configurable. Maybe overload the unorderedWait and orderedWait methods with an additional int parameter? - Do you plan to also add a RichFunc

[jira] [Created] (FLINK-4636) AbstractCEPPatternOperator fails to restore state

2016-09-19 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4636: Summary: AbstractCEPPatternOperator fails to restore state Key: FLINK-4636 URL: https://issues.apache.org/jira/browse/FLINK-4636 Project: Flink Issue Type

Re: Performance and Latency Chart for Flink

2016-09-16 Thread Fabian Hueske
Hi, I am not aware of periodic performance runs for the Flink releases. I know a few benchmarks which have been published at different points in time like [1], [2], and [3] (you'll probably find more). In general, fair benchmarks that compare different systems (if there is such a thing) are very di

Re: Re: [DISCUSS] how choose Scala and Java

2016-09-07 Thread Fabian Hueske
The Java and Scala APIs are organized in different Maven modules and the Scala APIs are based on the respective Java API. The benefit of this design is to keep Scala dependencies out of the Java APIs which is requested by many users. The Java and Scala counterparts of the DataSet and DataStream AP

Re: Assign tasks from JIRA

2016-09-07 Thread Fabian Hueske
Hi Kirill, welcome to the Flink community! I gave your JIRA account Contributor permissions for the FLINK JIRA. You should now be able to assign tasks to yourself. Best, Fabian 2016-09-07 9:46 GMT+02:00 Kirill Morozov : > Hello folks! > > I am novice in community, but I want to

Re: [DISCUSS] FLIP-11: Table API Stream Aggregations

2016-09-06 Thread Fabian Hueske
emtime" is used we have to offer a "allowLateness" method because > we have to assume that this attribute can also be the batch event time > column, which is not very nice. > > > > class TumblingWindow(size: Expression) extends Window { > > def on(timeField: Expr

Re: [DISCUSS] FLIP-11: Table API Stream Aggregations

2016-09-05 Thread Fabian Hueske
Hi Jark, you had asked for non-windowed aggregates in the Table API a few times. FLIP-11 proposes row-window aggregates which are a generalization of running aggregates (SlideRow unboundedPreceding). Can you have a look at the FLIP and give feedback whether this is what you are looking for? Impro

Re: Reducing the JIRA PR message verbosity

2016-09-02 Thread Fabian Hueske
+1 Thanks Max! 2016-09-02 15:20 GMT+02:00 Stephan Ewen : > Sounds good to me! > > On Fri, Sep 2, 2016 at 3:08 PM, Maximilian Michels wrote: > > > If there are no objections, I will contact Infra to change the GitHub > > JIRA notifications as follows: > > > > Jira comments section > > - initia

Extending FLIP template

2016-09-01 Thread Fabian Hueske
Hi, I'm currently preparing a FLIP for Table API streaming aggregates and noticed that there is no section about how the task can be divided into subtasks. I think it would make sense to extend the template by a section "Work Plan" or "Implementation Plan" that explains in which steps or subtask

Re: Streaming - memory management

2016-08-31 Thread Fabian Hueske
Hi Vinaj, if you use user-defined state, you have to manually clear it. Otherwise, it will stay in the state backend (heap or RocksDB) until the job goes down (planned or due to an OOM error). This is esp. important to keep in mind, when using keyed state. If you have an unbounded, evolving key s

Re: [VOTE] Release Apache Flink 1.1.2 (RC1)

2016-08-31 Thread Fabian Hueske
I compared the 1.1.2-rc1 branch with the 1.1.1 release tag and went over the diff. - No new dependencies were added - Versions of existing dependencies were not changed. - Did not notice anything that would block the release. +1 to release 1.1.2 2016-08-31 9:46 GMT+02:00 Robert Metzger : > +1 t

[jira] [Created] (FLINK-4513) Kafka connector documentation refers to Flink 1.1-SNAPSHOT

2016-08-26 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4513: Summary: Kafka connector documentation refers to Flink 1.1-SNAPSHOT Key: FLINK-4513 URL: https://issues.apache.org/jira/browse/FLINK-4513 Project: Flink

Re: [DISCUSS] Releasing Flink 1.1.2

2016-08-26 Thread Fabian Hueske
+1 for a 1.1.2 release. I'd like to have the Joda time exclusion removed from the Quickstart POM to include the dependency in the fat jar. Otherwise jobs with that dependency (such as the Flink training examples) fail with ClassNotFound. Robert fixed that on the side in https://github.com/apache/fl

[jira] [Created] (FLINK-4480) Incorrect link to elastic.co in documentation

2016-08-24 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4480: Summary: Incorrect link to elastic.co in documentation Key: FLINK-4480 URL: https://issues.apache.org/jira/browse/FLINK-4480 Project: Flink Issue Type: Bug

Re: [DISCUSS] Some thoughts about unify Stream SQL and Batch SQL grammer

2016-08-24 Thread Fabian Hueske
ute our work back to Flink. I will look into it and > create JIRAs in the next few days. > > - Jark Wu > > > On 24 Aug 2016, at 12:13 AM, Fabian Hueske wrote: > > > > Hi Jark, > > > > We can think about removing the STREAM keyword or not. In principle, > > Calcite should al

Re: [DISCUSS] Some thoughts about unify Stream SQL and Batch SQL grammer

2016-08-23 Thread Fabian Hueske
gn. > > +1 for creating FLIP > > [1] https://docs.google.com/document/d/15iVc1781dxYWm3loVQlESYvMAxEzbbuVFPZWBYuY1Ek > > > - Jark Wu > > > On 23 Aug 2016, at 3:47 PM, Fabian Hueske wrote: > > > > Hi, > > > > I did a bit of prototyping yesterday to ch

[jira] [Created] (FLINK-4453) Scala code example in Window documentation shows Java

2016-08-23 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4453: Summary: Scala code example in Window documentation shows Java Key: FLINK-4453 URL: https://issues.apache.org/jira/browse/FLINK-4453 Project: Flink Issue

Re: [DISCUSS] Some thoughts about unify Stream SQL and Batch SQL grammer

2016-08-23 Thread Fabian Hueske
document/d/19kSOAOINKCSWLBCKRq2WvNtmuaA9o3AyCh2ePqr3V5E 2016-08-19 11:04 GMT+02:00 Fabian Hueske : > Hi Jark, > > thanks for starting this discussion. Actually, I think we are rather > "blocked" on the internal handling of streaming windows in Calcite than the > SQL parser. IMO, it should be possible

Re: [FLINK-305] Code test coverage - how FLINK using it?

2016-08-19 Thread Fabian Hueske
Hi Pavel, the Cobertura plugin was removed in this PR: https://github.com/apache/flink/pull/454 Not sure if it was accidentally removed or on purpose. It was not included in the regular builds to reduce build time and AFAIK, it wasn't manually used either (otherwise somebody would have noticed tha

Re: [DISCUSS] Some thoughts about unify Stream SQL and Batch SQL grammer

2016-08-19 Thread Fabian Hueske
Hi Jark, thanks for starting this discussion. Actually, I think we are rather "blocked" on the internal handling of streaming windows in Calcite than the SQL parser. IMO, it should be possible to exchange or modify the parser if we want that. Regarding Calcite's StreamSQL syntax: Except for the S

Re: [ANNOUNCE] Flink 1.1.0 Released

2016-08-09 Thread Fabian Hueske
Thanks Ufuk and everybody who contributed to the release! Cheers, Fabian 2016-08-08 20:41 GMT+02:00 Henry Saputra : > Great work all. Great Thanks to Ufuk as RE :) > > On Monday, August 8, 2016, Stephan Ewen wrote: > > > Great work indeed, and big thanks, Ufuk! > > > > On Mon, Aug 8, 2016 at 6:

Re: Map Reduce Sorting

2016-08-04 Thread Fabian Hueske
eSort algorithm can run faster. > > > Are there any benefits if the map partitions are sorted? > > > Thank you > > > BR, > > Hilmi > > > On 02.08.2016 at 10:01, Hilmi Yildirim wrote: > >> Hi Fabian, >> >> thank you very much! This

Re: Map Reduce Sorting

2016-08-01 Thread Fabian Hueske
Hi Hilmi, the results of the combiner are usually not completely sorted and, if they are, this property is not leveraged. This is due to the following reasons: 1) a sort-combiner only sorts as much data as fits into memory. If there is more data, the result consists of multiple sorted sequences. 2)
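The first reason can be illustrated with a few lines of plain Java. This is only a sketch of the general idea, not Flink's combiner: the `SortCombinerRuns` class, its methods, and the `memoryBudget` parameter (the number of elements assumed to fit into memory at once) are hypothetical.

```java
import java.util.Arrays;

final class SortCombinerRuns {

    /**
     * Sorts the input in chunks of at most memoryBudget elements and emits the
     * chunks back to back, the way a memory-bounded sorter would without a final merge.
     */
    static int[] sortInChunks(int[] input, int memoryBudget) {
        if (memoryBudget <= 0) {
            throw new IllegalArgumentException("memoryBudget must be positive");
        }
        int[] out = Arrays.copyOf(input, input.length);
        for (int start = 0; start < out.length; start += memoryBudget) {
            int end = Math.min(start + memoryBudget, out.length);
            Arrays.sort(out, start, end); // sorts only the range [start, end)
        }
        return out;
    }

    /** Counts the maximal non-decreasing runs in the array. */
    static int countSortedRuns(int[] data) {
        if (data.length == 0) {
            return 0;
        }
        int runs = 1;
        for (int i = 1; i < data.length; i++) {
            if (data[i] < data[i - 1]) {
                runs++;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        int[] input = {5, 3, 8, 1, 9, 2, 7, 4};
        System.out.println(Arrays.toString(sortInChunks(input, 3)));
        System.out.println("runs: " + countSortedRuns(sortInChunks(input, 3)));
    }
}
```

With eight input elements and a budget of three, the output consists of three sorted sequences rather than one globally sorted sequence; only when the budget covers the whole input is the result fully sorted.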

Re: primitiveDefaultValue in CodeGenUtils in Table API

2016-06-29 Thread Fabian Hueske
Hi Cody, Aggregations are currently not performed by code-generated user functions. This would be a good improvement though. Check the DataSetAggregate class to learn how aggregations are translated into Flink DataSet programs. Best, Fabian From: Cody Innowhere

Re: Flink Table & SQL doesn't work in very simple example

2016-06-23 Thread Fabian Hueske
Hi Jark Wu, yes, that looks like a dependency issue. Can you open a JIRA for it and set "Fix Version" to 1.1.0? This issue should be resolved for the 1.1 release. Thanks, Fabian 2016-06-22 3:52 GMT+02:00 Jark Wu : > Hi, > > > I’m trying to use Flink Table 1.1-SNAPSHOT where I want to use Table API

Re: UDF in Flink table API

2016-06-23 Thread Fabian Hueske
Hi Cody, actually, there is no functionality to add a UDF in the Table API / SQL. This feature is definitely on the roadmap but not implemented, yet. The related JIRA is FLINK-3097. In fact, the FunctionCatalog and FrameworkConfig should not be publicly accessible from the TableEnvironment, IMO.

[jira] [Created] (FLINK-4088) Add interface to save and load TableSources

2016-06-17 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4088: Summary: Add interface to save and load TableSources Key: FLINK-4088 URL: https://issues.apache.org/jira/browse/FLINK-4088 Project: Flink Issue Type

[jira] [Created] (FLINK-4086) Hide internal Expression methods from Table API

2016-06-17 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4086: Summary: Hide internal Expression methods from Table API Key: FLINK-4086 URL: https://issues.apache.org/jira/browse/FLINK-4086 Project: Flink Issue Type
