Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-146826450
I will merge this now since the code is also changed. We can iterate on
this if need be.
---
If your project is set up for it, you can reply to this email and have
Github user ktzoumas closed the pull request at:
https://github.com/apache/flink/pull/1208
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-146829953
@asfgit didn't close it automatically. Could you please close it, @ktzoumas?
Github user ktzoumas commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-146484570
Thanks a lot Fabian for the fantastic review! I addressed most of your
comments. The state section needs to be rewritten after
Github user mbalassi commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-146542207
Thanks for this big update, guys!
Some comments:
* Explanation of operator chaining and options are missing.
* Collection Data Sources: The
Github user ktzoumas commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-146576052
@mbalassi: is the operator chaining currently documented somewhere? Same
about partitioning between operators of different parallelism
Github user ktzoumas commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-146624231
Added docs for chaining and resource groups
Github user ktzoumas commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-146203981
Once https://github.com/apache/flink/pull/1238 is merged, this is ready to
be merged as well (the docs track the changes in that PR)
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-146264681
It's a bit painful to review because GitHub is not showing the diff :-(
Here is what I found in `streaming_guide.md` up to but not including
`Specifying Keys`:
Github user ktzoumas commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145936851
I addressed the comments above, added the content for Scala, reworked the
iterations section, added content to the windows section, and made the
DataStream API guide
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145013573
Two comments:
- I would rename the section "Buffer Timeout" to "Controlling Latency"
or so. It helps people that are interested in latency to find the
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145019903
The WindowFunction has a different signature for regular windows and all
windows. This should maybe be visible in the transformations section.
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145024860
That is what I mean, global time reduce.
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145024435
Global windows are not parallel, not in any system, it is inherent in the
operation.
You can pre-aggregate in parallel, if the windows are time windows.
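The parallel pre-aggregation mentioned above can be illustrated with a minimal plain-Java sketch. It does not use the Flink API. The class `PreAggregation`, its methods, the `{timestamp, value}` event layout, and the window size of 1000 are all hypothetical and chosen only for illustration. Each parallel partition computes partial sums per time window, and a single non-parallel step combines the partial results.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PreAggregation {

    // Hypothetical helper: partial sums per window start for ONE partition.
    // Each event is {timestamp, value}; timestamps are assumed non-negative.
    public static Map<Long, Long> preAggregate(List<long[]> events, long windowSize) {
        Map<Long, Long> partial = new TreeMap<>();
        for (long[] e : events) {
            long windowStart = e[0] - (e[0] % windowSize);
            partial.merge(windowStart, e[1], Long::sum);
        }
        return partial;
    }

    // The final, non-parallel step: merge the partial sums of all partitions.
    public static Map<Long, Long> combine(List<Map<Long, Long>> partials) {
        Map<Long, Long> result = new TreeMap<>();
        for (Map<Long, Long> p : partials) {
            p.forEach((window, sum) -> result.merge(window, sum, Long::sum));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two partitions that could be pre-aggregated independently.
        List<long[]> p1 = Arrays.asList(new long[] {10, 1}, new long[] {1200, 5});
        List<long[]> p2 = Arrays.asList(new long[] {900, 2}, new long[] {1500, 7});
        Map<Long, Long> total = combine(Arrays.asList(
                preAggregate(p1, 1000), preAggregate(p2, 1000)));
        System.out.println(total);
    }
}
```

The sketch relies on the reduce function being associative and commutative, as a sum is. Only the small per-window partial results reach the single final step.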
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145017010
Sometimes you write DataSet in the stream guide where it should be
DataStream.
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145017991
The description of Evictors for the keyed windows is a little bit messy;
it is unclear what gets evicted from where. There is also some conflict between
the example and the description:
Github user senorcarbone commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145032778
Ah, indeed. I just reviewed this with the prospect of custom
pre-aggregations in mind and it seems like pre-aggregation strategies operate
on bucket-granularity.
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145017812
In the section "Transformations" you sometimes use the `` syntax to
highlight words, such as DataStream. Inside the table they don't get properly
translated, however.
Github user senorcarbone commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145025577
Nice doc so far ^^
One tiny fix on the *Advanced window constructs* subsection:
[0,1000], [100,1100], **[200,1200]**, ..., [1000, 2000]
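Assuming the listing above describes sliding windows of size 1000 that slide by 100 (an inference from the sequence itself, not something stated in the comment), the corrected sequence can be reproduced with a short plain-Java sketch. The class `SlidingWindows` and its method are hypothetical names and not part of the Flink API.

```java
import java.util.ArrayList;
import java.util.List;

public class SlidingWindows {

    // Hypothetical helper: returns {start, end} pairs for windows of the given
    // size whose start is a multiple of slide, from 0 up to lastStart inclusive.
    public static List<long[]> windows(long size, long slide, long lastStart) {
        List<long[]> result = new ArrayList<>();
        for (long start = 0; start <= lastStart; start += slide) {
            result.add(new long[] {start, start + size});
        }
        return result;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (long[] w : windows(1000, 100, 1000)) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append('[').append(w[0]).append(',').append(w[1]).append(']');
        }
        System.out.println(sb);
    }
}
```

Under these assumptions the third window is [200,1200], which matches the entry highlighted in the fix.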
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145025216
Would be a nice addition to the docs, to state that global time reduce (and
other aggregations) are pre-aggregated in parallel.
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145021034
There should be some streaming specific stuff in the `Execution Config`
section. For example, there is `enableTimestamps()` and
`setAutoWatermarkInterval()`.
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145029475
@senorcarbone It doesn't restrict. You can use an assigner that assigns to
one single window. Then, using trigger and evictor, you can implement everything
that was
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145020080
@gyfora Yes, they work with the same semantics. What do you mean by "fix
time trigger/eviction determined by the window assigner"?
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145020330
@gyfora, yes, that's right.
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145020261
@aljoscha: what I meant was just that we define some time semantics with
the window assigner
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145020475
The ConnectedStreams example formatting is off (in Transformations
section), also there should be something like:
```java
ConnectedStreams<> connectedStreams =
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145020604
"Windows on unkeyed data streams (non-parallel windows)"
I still think that this gives the false impression of executing all
global windows including time in a
Github user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145020534
If split is mentioned select should also be mentioned.
Github user senorcarbone commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-145028978
From a quick read of the documentation (and prior knowledge from Google
Dataflow) it is easy to get a full picture of the new semantics. Even though I
like it from