Re: [DISCUSS] Development of SQL OVER / Table API Row Windows for streaming tables

2017-01-25 Thread Shaoxuan Wang
Hi Liuxinchun, I am not sure where did you get the inception: anyone has suggested "to process Event time window in Sliding Row Window". If you were referring my post, there may be some misunderstanding there. I think you were asking the similar question as Hongyuhong. I have just replied to him.

Re: [DISCUSS] Development of SQL OVER / Table API Row Windows for streaming tables

2017-01-25 Thread Fabian Hueske
Hi everybody, thanks for the great discussions so far. It's awesome to see so much interest in this topic! First, I'd like to comment on the development process for this feature and later on the design of the runtime: Dev Process @Shaoxuan, I completely agree with you. We should first come

Re: [DISCUSS] Development of SQL OVER / Table API Row Windows for streaming tables

2017-01-25 Thread liuxinchun
I don't think it is a good idea to process Event time window in Sliding Row Window. In Sliding Time window, when an element is late, we can trigger the recalculation of the related windows. And the sliding period is coarse-gained, We only need to recalculate size/sliding number of windows. But

[jira] [Created] (FLINK-5636) IO Metric for StreamTwoInputProcessor

2017-01-25 Thread david.wang (JIRA)
david.wang created FLINK-5636: - Summary: IO Metric for StreamTwoInputProcessor Key: FLINK-5636 URL: https://issues.apache.org/jira/browse/FLINK-5636 Project: Flink Issue Type: Bug

Re: [VOTE] Release Apache Flink 1.2.0 (RC1)

2017-01-25 Thread Stefan Richter
Hi Robert, I found an potientally blocking issue in RC1 that I would like to bring up for debate. In [FLINK-5602], Ufuk found a NPE that happened during the first checkpoint after migrating a job from Flink 1.1 to 1.2. I looked into this and found that the cause of this problem is the absense

Re: flink-ml test

2017-01-25 Thread Theodore Vasiloudis
Hello Anton, I usually run specific local tests through IDEA, or test or the whole ML module (run mvn test in the flink-ml root dir) . It should be possible to run specific tests through maven [1], but I haven't been able to make this work. Which test is failing for you? [1]

Re: [DISCUSS] Development of SQL OVER / Table API Row Windows for streaming tables

2017-01-25 Thread Jinkui Shi
Hi, Fabian, Shaoxuan, Yuhong - OVER RANGE for processing time I think your design make sense. Only considering processing time will simplify the design, make it robust. The state will be saved in a queue, and the incoming data line will apply the given and user defined function one by one. Do

[jira] [Created] (FLINK-5637) Default Flink configuration contains whitespace characters, causing parser WARNings

2017-01-25 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5637: - Summary: Default Flink configuration contains whitespace characters, causing parser WARNings Key: FLINK-5637 URL: https://issues.apache.org/jira/browse/FLINK-5637

[jira] [Created] (FLINK-5639) Clarify License implications of RabbitMQ Connector

2017-01-25 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-5639: --- Summary: Clarify License implications of RabbitMQ Connector Key: FLINK-5639 URL: https://issues.apache.org/jira/browse/FLINK-5639 Project: Flink Issue Type:

[CANCELLED][VOTE] Release Apache Flink 1.2.0 (RC1)

2017-01-25 Thread Robert Metzger
Thank you for checking the licenses and fixing it already Stephan! @Chesnay: I think we just have to mention it in the LICENSE file, then we are good to use it. *I will cancel this release candidate then and create RC2 at 6pm CET (roughly in 4 hours) today.* We can use these 4 hours to fix the

[jira] [Created] (FLINK-5641) Directories of expired checkpoints are not cleaned up

2017-01-25 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5641: - Summary: Directories of expired checkpoints are not cleaned up Key: FLINK-5641 URL: https://issues.apache.org/jira/browse/FLINK-5641 Project: Flink Issue

[jira] [Created] (FLINK-5642) queryable state: race condition with HeadListState

2017-01-25 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-5642: -- Summary: queryable state: race condition with HeadListState Key: FLINK-5642 URL: https://issues.apache.org/jira/browse/FLINK-5642 Project: Flink Issue Type: Bug

#quesiton: documentation for knn/examples/fetching cluster numbers in red?

2017-01-25 Thread Alex De Castro
Hi flinkers, I a new flink user and have been working on a pipeline from kafka to mongo that processes documents and cluster them using inn for information retrieval. I have a data set which is an array of sparse vectors, but I’m having a bit of a hard time interpreting the output of the knn

STREAM SQL inner queries

2017-01-25 Thread Radu Tudoran
Hi all, I would like to open a jira issue (and then provide the implementation) for supporting inner queries. The idea is to be able to support SQL queries as the ones presented in the scenarios below. The key idea is that supporting inner queries would require to have the implementation for:

[jira] [Created] (FLINK-5638) Deadlock when closing two chained async I/O operators

2017-01-25 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-5638: Summary: Deadlock when closing two chained async I/O operators Key: FLINK-5638 URL: https://issues.apache.org/jira/browse/FLINK-5638 Project: Flink Issue

Re: [VOTE] Release Apache Flink 1.2.0 (RC1)

2017-01-25 Thread Chesnay Schepler
Which dependency is MIT? On 25.01.2017 13:24, Stephan Ewen wrote: Did a License cross-check: All Maven Dependencies are okay. Added a small note on the RabbitMQ dependency's MPL 1.1 implications The Web UI added three new dependencies which are source-bundled since the last release - One

Re: [VOTE] Release Apache Flink 1.2.0 (RC1)

2017-01-25 Thread Chesnay Schepler
Hello, FLINK-5612 might be a blocker as well; basically if you use the GlobFilePathFilter in a job, chances are it will fail. Also, the lastCheckpointSize metric that we collect for each task was broken when the key-groups were introduced. Regards, Chesnay On 25.01.2017 10:52, Stefan

RE: [DISCUSS] Development of SQL OVER / Table API Row Windows for streaming tables

2017-01-25 Thread Radu Tudoran
Hi, I think it is a good idea to open these proposed Jira issues. I also like the approach to separate the usage of windows(globalwindows) and other operators based on the specific type of translation. One question regarding the design of these: should we build one translation rule for each

Re: [VOTE] Release Apache Flink 1.2.0 (RC1)

2017-01-25 Thread Stephan Ewen
Did a License cross-check: All Maven Dependencies are okay. Added a small note on the RabbitMQ dependency's MPL 1.1 implications The Web UI added three new dependencies which are source-bundled since the last release - One was added to the license file - One is ASL 2.0 (no license update

Re: [VOTE] Release Apache Flink 1.2.0 (RC1)

2017-01-25 Thread Stephan Ewen
The angular-drag-and-drop-list https://github.com/apache/flink/commit/34e106f63c9dcd2673d66b47fda1555b7dced770 On Wed, Jan 25, 2017 at 1:48 PM, Chesnay Schepler wrote: > Which dependency is MIT? > > > On 25.01.2017 13:24, Stephan Ewen wrote: > >> Did a License

[jira] [Created] (FLINK-5640) configure the explicit Unit Test file suffix

2017-01-25 Thread shijinkui (JIRA)
shijinkui created FLINK-5640: Summary: configure the explicit Unit Test file suffix Key: FLINK-5640 URL: https://issues.apache.org/jira/browse/FLINK-5640 Project: Flink Issue Type: Test

Re: [VOTE] Release Apache Flink 1.2.0 (RC1)

2017-01-25 Thread Stephan Ewen
@chesnay - we can keep the library, we just needed to update the license file and do a new release candidate. On Wed, Jan 25, 2017 at 1:59 PM, Chesnay Schepler wrote: > As far as i know this is only used on the metrics page, so we could remove > it temporarily until we found

[jira] [Created] (FLINK-5643) StateUtil.discardStateFuture fails when state future contains null value

2017-01-25 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-5643: Summary: StateUtil.discardStateFuture fails when state future contains null value Key: FLINK-5643 URL: https://issues.apache.org/jira/browse/FLINK-5643 Project:

[jira] [Created] (FLINK-5644) Task#lastCheckpointSize metric broken

2017-01-25 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-5644: --- Summary: Task#lastCheckpointSize metric broken Key: FLINK-5644 URL: https://issues.apache.org/jira/browse/FLINK-5644 Project: Flink Issue Type: Bug

Re: [VOTE] Release Apache Flink 1.2.0 (RC1)

2017-01-25 Thread Chesnay Schepler
As far as i know this is only used on the metrics page, so we could remove it temporarily until we found a replacement. This wouldn't remove functionality that existed before 1.2. On 25.01.2017 13:50, Stephan Ewen wrote: The angular-drag-and-drop-list

Re: Flink with Yarn on MapR

2017-01-25 Thread ani.desh1512
Robert Metzger wrote > The second one is tougher to fix. It seems that there is an issue with > loading the Hadoop configuration correctly. > Can you post the contents of the client log file from the "log/" > directory? > It contains for example the Hadoop version being used (Maybe it didn't >

[jira] [Created] (FLINK-5645) WebInterface does not show IO metrics for failed jobs

2017-01-25 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-5645: --- Summary: WebInterface does not show IO metrics for failed jobs Key: FLINK-5645 URL: https://issues.apache.org/jira/browse/FLINK-5645 Project: Flink

TestBaseUtils refactoring

2017-01-25 Thread Anton Solovev
Hello guys, I think there is a sense to tear apart TestBaseUtils into a number of util classes which it keeps them. Because some test classes can't extend this base test class, but need in its result checkers Best, Anton

[jira] [Created] (FLINK-5649) KryoException when starting job from Flink 1.2.0 rc0 savepoint

2017-01-25 Thread Scott Kidder (JIRA)
Scott Kidder created FLINK-5649: --- Summary: KryoException when starting job from Flink 1.2.0 rc0 savepoint Key: FLINK-5649 URL: https://issues.apache.org/jira/browse/FLINK-5649 Project: Flink

Flink 1.2.0 rc2 Bugs

2017-01-25 Thread Scott Kidder
I encountered the following two bugs while testing Flink 1.2.0 rc2: Task Manager ID missing from logs link in Job Manager UI https://issues.apache.org/jira/browse/FLINK-5648 KryoException when starting job from Flink 1.2.0 rc0 savepoint https://issues.apache.org/jira/browse/FLINK-5649 The last

Re: Flink with Yarn on MapR

2017-01-25 Thread Robert Metzger
Hi, no problem. Can you send me the log statements written to standard out on the client? On Wed, Jan 25, 2017 at 5:02 PM, ani.desh1512 wrote: > Robert Metzger wrote > > The second one is tougher to fix. It seems that there is an issue with > > loading the Hadoop

Re: flink-ml test

2017-01-25 Thread Anton Solovev
Thank you Theodore StochasticOutlierSelectionITSuite has failed twice, but at third time it’s okey, I can’t share logs it disappeared somewhere in travis 25.01.17, 18:27 пользователь "Theodore Vasiloudis" написал: Hello Anton, I usually run

[jira] [Created] (FLINK-5650) flink-python unit test costs more than half hour

2017-01-25 Thread shijinkui (JIRA)
shijinkui created FLINK-5650: Summary: flink-python unit test costs more than half hour Key: FLINK-5650 URL: https://issues.apache.org/jira/browse/FLINK-5650 Project: Flink Issue Type: Bug

Re: [DISCUSS] (Not) tagging reviewers

2017-01-25 Thread Jinkui Shi
Hi, all Thanks to the reviewers’ hard working. When a new issue or pull request, the issue provider hope can be reviewed quickly. So we usually “@somebody”, it’s only a request for quick review. Maybe we can identify whether this issue is indeed necessary or its obvious problem, so as to decide

Re: TestBaseUtils refactoring

2017-01-25 Thread Jark Wu
Hi Anton, Thanks for bringing up this discussion. I think TestBaseUtils is a util class not a base class to extend. All the methods in TestBaseUtils are static, that means we can use any method directly without extend it. Thanks, Jark > 在 2017年1月26日,上午7:46,Anton Solovev

Re: [DISCUSS] (Not) tagging reviewers

2017-01-25 Thread Haohui Mai
+1. Nice to be pinged if required, but relying on specific people to review simply does not scale. Popping up one level, occasionally the fact that people tag other people is because the PRs are left unreviewed for a while (or they believe the PRs will not be reviewed unless they explicitly ping

Re: [DISCUSS] Development of SQL OVER / Table API Row Windows for streaming tables

2017-01-25 Thread Jark Wu
Hi Fabian, I completely aggree with the six JIRAs and different runtime implementations. And I also aggree with @shaoxuan's proposal can work for both processing time and event time. Hi Shaoxuan, I really like the idea you proposed that using retraction to decrease computation. It's a great

re: STREAM SQL inner queries

2017-01-25 Thread Zhangrucong
Hi: The following syntax, I think it is sub query. In the calcite, it is optimal to semi-join. SELECT STREAM amount, (SELECT id FROM inputstream1) AS field1 FROM inputstream2 In my opinion, If the unbounded stream join the unbounded stream, there is no result. The stream may join with

Re: TestBaseUtils refactoring

2017-01-25 Thread Jinkui Shi
hi, all UT and CT have clear boundary and consensus of opinion: 1. UT and CT separated. surefire plugin have two different execution for CT and UT 2. Temporary file creating and destroy should be unified with TemporaryFolder. FLINK-5546 has change the java.io .tmpdir to

Re: STREAM SQL inner queries

2017-01-25 Thread Shaoxuan Wang
Hi Radu, Similar as the stream-stream join, this stream-stream inner query does not seem to be well defined. It needs provide at least some kind of window bounds to complete the streaming SQL semantics. If this is an unbounded join/select, a mechanism of how to store the infinite date has to be

Re: [DISCUSS] Development of SQL OVER / Table API Row Windows for streaming tables

2017-01-25 Thread Shaoxuan Wang
Yes Fabian, I will complete my design with more thorough thoughts. BTW, I think the incremental aggregate (the key point I suggested is to eliminate state per each window) I proposed should work for both processing time and event time. It just does not need a sorted state for the processing time

Re: [CANCELLED][VOTE] Release Apache Flink 1.2.0 (RC1)

2017-01-25 Thread Robert Metzger
Looks like nobody objected my plans :) I'm going to create a new RC for 1.2.0 now. On Wed, Jan 25, 2017 at 2:13 PM, Robert Metzger wrote: > Thank you for checking the licenses and fixing it already Stephan! > > @Chesnay: I think we just have to mention it in the LICENSE

[jira] [Created] (FLINK-5646) REST api documentation missing details on jar upload

2017-01-25 Thread Cliff Resnick (JIRA)
Cliff Resnick created FLINK-5646: Summary: REST api documentation missing details on jar upload Key: FLINK-5646 URL: https://issues.apache.org/jira/browse/FLINK-5646 Project: Flink Issue

Re: Flink with Yarn on MapR

2017-01-25 Thread ani.desh1512
Following is the output when I execute *./bin/yarn-session.sh -n 2* /2017-01-25 15:53:57,805 INFO org.apache.flink.configuration.GlobalConfiguration- Loading configuration property: jobmanager.rpc.address, 10.101.2.111 2017-01-25 15:53:57,807 INFO

Re: [Proposal] RichFsSinkFunction

2017-01-25 Thread Fabian Hueske
Hi Seth, first of all, thanks for sharing your use case and your plan to contribute to Flink. I guess processing files as soon as they are completed is a very common use case. Publishing the paths of completed files as a data stream sounds like a nice idea. IMO, it would be great to have this

[jira] [Created] (FLINK-5647) Fix RocksDB Backend Cleanup

2017-01-25 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-5647: --- Summary: Fix RocksDB Backend Cleanup Key: FLINK-5647 URL: https://issues.apache.org/jira/browse/FLINK-5647 Project: Flink Issue Type: Bug

[VOTE] Release Apache Flink 1.2.0 (RC2)

2017-01-25 Thread Robert Metzger
Dear Flink community, Please vote on releasing the following candidate as Apache Flink version 1.2.0. The commit to be voted on: 8b5b6a8b (http://git-wip-us.apache.org/repos/asf/flink/commit/8b5b6a8b) Branch: release-1.2.0-rc2 (https://git1-us-west.apache.org/repos/asf/flink/repo?p=flin

Re: States split over to external storage

2017-01-25 Thread Chen Qin
Hi Stephan ,Fabian, Liuxin, Looks like it's already solved with dynamic scale keygroup along with incremental policy. Our use case is like a workflow model where high volume events (million TPS) & long holding window (24 hours), very small percentage of events will be forwarded to next operator.