[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-09 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-146826450 I will merge this now since the code is also changed. We can iterate on this if need be. --- If your project is set up for it, you can reply to this email and have

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-09 Thread ktzoumas
Github user ktzoumas closed the pull request at: https://github.com/apache/flink/pull/1208

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-09 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-146829953 @asfgit didn't close it automatically; could you please close it, @ktzoumas?

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-08 Thread ktzoumas
Github user ktzoumas commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-146484570 Thanks a lot Fabian for the fantastic review! I addressed most of your comments. The state section needs to be rewritten after

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-08 Thread mbalassi
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-146542207 Thanks for this big update, guys! Some comments: * Explanation of operator chaining and options are missing. * Collection Data Sources: The

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-08 Thread ktzoumas
Github user ktzoumas commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-146576052 @mbalassi: is operator chaining currently documented somewhere? Same question for partitioning between operators of different parallelism.

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-08 Thread ktzoumas
Github user ktzoumas commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-146624231 Added docs for chaining and resource groups

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-07 Thread ktzoumas
Github user ktzoumas commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-146203981 Once https://github.com/apache/flink/pull/1238 is merged, this is ready to be merged as well (the docs track the changes in that PR)

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-07 Thread fhueske
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-146264681 It's a bit painful to review because GitHub is not showing the diff :-( Here is what I found in `streaming_guide.md` up to but not including `Specifying Keys`:

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-06 Thread ktzoumas
Github user ktzoumas commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145936851 I addressed the comments above, added the content for Scala, reworked the iterations section, added content to the windows section, and made the DataStream API guide

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145013573 Two comments: - I would rather call the "Buffer Timeout" section "Controlling Latency" or so. It helps people who are interested in latency to find the

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145019903 The WindowFunction has a different signature for regular windows and all windows. This should maybe be visible in the transformations section.

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145024860 That is what I mean, global time reduce.

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145024435 Global windows are not parallel, not in any system, it is inherent in the operation. You can pre-aggregate in parallel, if the windows are time windows.
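
The point about pre-aggregating a global time window in parallel can be illustrated with a small sketch. This is plain Java with no Flink dependency, and it is not how Flink implements it; the class `PreAggregation` and the methods `preAggregate` and `combine` are hypothetical names chosen for illustration. It assumes the reduce function is associative, so that partial results can be combined in any grouping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical illustration only; these names are not part of any Flink API.
public class PreAggregation {

    // Step 1 (parallelizable): each partition reduces its own share of the
    // window contents to at most one partial result.
    static <T> List<T> preAggregate(List<List<T>> partitions, BinaryOperator<T> reduce) {
        List<T> partials = new ArrayList<>();
        for (List<T> partition : partitions) {
            // Empty partitions contribute no partial result.
            partition.stream().reduce(reduce).ifPresent(partials::add);
        }
        return partials;
    }

    // Step 2 (inherently non-parallel): a single task combines the partials
    // into the final result for the global window.
    static <T> T combine(List<T> partials, BinaryOperator<T> reduce) {
        return partials.stream()
                .reduce(reduce)
                .orElseThrow(() -> new IllegalStateException("empty window"));
    }

    public static void main(String[] args) {
        // Elements of one time window, already spread over three partitions.
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3),
                Arrays.asList(4, 5));
        List<Integer> partials = preAggregate(partitions, Integer::sum);
        System.out.println(combine(partials, Integer::sum));
    }
}
```

Only the second step has to run in a single task, and it sees one value per partition rather than every element of the window.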

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145017010 Sometimes you write DataSet in the stream guide where it should be DataStream.

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145017991 The description of Evictors for the keyed windows is a little bit messy: it is unclear what gets evicted from where. There is also some conflict between the example and the description:

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread senorcarbone
Github user senorcarbone commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145032778 Ah Indeed. I just reviewed this with the prospect of custom pre-aggregations in mind and it seems like pre-aggregation strategies operate on bucket-granularity.

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145017812 In the section "Transformations" you sometimes use the `` syntax to highlight words, such as DataStream. Inside the table they don't get properly translated, however.

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread senorcarbone
Github user senorcarbone commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145025577 Nice doc so far ^^ One tiny fix on the *Advanced window constructs* subsection: [0,1000], [100,1100], **[200,1200]**, ..., [1000, 2000]
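
The corrected sequence above is consistent with windows of length 1000 that slide by 100. A minimal sketch that enumerates them, assuming exactly those two values and a last window start of 1000 (the class `SlidingWindows` and the method `windows` are hypothetical names, not Flink API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these names are not part of any Flink API.
public class SlidingWindows {

    // Returns {start, end} pairs for windows of length `size` whose start
    // advances by `slide`, from start 0 up to and including `lastStart`.
    static List<long[]> windows(long size, long slide, long lastStart) {
        List<long[]> result = new ArrayList<>();
        for (long start = 0; start <= lastStart; start += slide) {
            result.add(new long[] {start, start + size});
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints one window per line in the form [start,end].
        for (long[] w : windows(1000, 100, 1000)) {
            System.out.println("[" + w[0] + "," + w[1] + "]");
        }
    }
}
```

With these assumed values the third window starts at 200 and ends at 1200, matching the fix.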

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread StephanEwen
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145025216 Would be a nice addition to the docs, to state that global time reduce (and other aggregations) are pre-aggregated in parallel.

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145021034 There should be some streaming specific stuff in the `Execution Config` section. For example, there is `enableTimestamps()` and `setAutoWatermarkInterval()`.

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145029475 @senorcarbone It doesn't restrict. You can use an assigner that assigns to one single window. Then, using trigger and evictor, you can implement everything that was

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145020080 @gyfora Yes, they work with the same semantics. What do you mean by "fix time trigger/eviction determined by the window assigner"?

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145020330 @gyfora, yes, that's right

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145020261 @aljoscha : what I meant was just that we define some time semantics with the window assigner

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145020475 The ConnectedStreams example formatting is off (in Transformations section), also there should be something like: ```java ConnectedStreams<> connectedStreams =

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread gyfora
Github user gyfora commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145020604 "Windows on unkeyed data streams (non-parallel windows)" I still think that this gives a bad false impression of executing all global windows including time in a

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread aljoscha
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145020534 If split is mentioned select should also be mentioned.

[GitHub] flink pull request: [FLINK-2779][FLINK-2794][streaming][docs] New ...

2015-10-02 Thread senorcarbone
Github user senorcarbone commented on the pull request: https://github.com/apache/flink/pull/1208#issuecomment-145028978 From a quick read on the documentation (and prior knowledge from google dataflow) it is easy to get a full picture of the new semantics. Even though I like it from