[
https://issues.apache.org/jira/browse/FLINK-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947865#comment-14947865
]
ASF GitHub Bot commented on FLINK-2779:
---------------------------------------
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/1208#issuecomment-146378690
Review 2nd part (until Connectors):
General & Outline
- Inconsistent capitalization of sections
- Switch "Execution Configuration" and "Data Sinks" sections to have Sinks
directly after Sources?
Specifying Keys
- "A DataStream is keyed OAS"
Data Types
- and as results of transformations. -> and ON THE results of
transformations.
Data Sources (Java and Scala)
- Creates a data set -> Creates a data stream (4 times)
Debugging
- Before running a streaming program on a large data set in a distributed
cluster -> Before running a DataStream program in a distributed cluster ?
Debugging - Local Execution Environment
- breakpoint -> breakpointS
Working with time
- The event time paragraph says: "This time is either recorded in a
timestamp embedded within the records before they enter Flink, or is assigned
at the source." Isn't it ingestion time if timestamps are assigned at the
source?
- Does ingestion time not require watermarks?
- "In order to work with event time semantics, you need to follow three
steps": isn't it four steps (the last step is really two steps)?
- What is the getCurrentWatermark() { return Long.MIN_VALUE;} good for?
Windows on keyed data streams
- Why do sliding windows "overlap by AT LEAST 4 secs / 900 elements"?
Shouldn't it be: "subsequent windows overlap by 4 secs / 900 elements"?
Advanced window constructs
- "the SlidingTimeWindows assigner in the example": which example are you
referring to? It should be a size of 5000 ms and a slide of 1000 ms, not 1000
and 100 ms, right?
- Delta trigger: the current element refers to the element that was last
added to the window?
Fault tolerance
- where n is how often a checkpoint is taken in milliseconds. -> where n is
the checkpointing interval in milliseconds ?
- Is there no Kafka Sink (KafkaProducer)? It is missing in the sink
guarantee table.
Parallelism
- Shouldn't we add here how data is repartitioned if the DOP of subsequent
operators increases or decreases?
Controlling latency
- Also show the parameter for the buffer size.
Working with state
- The end effect is that updates to any form of state are the same under
failure-free execution and execution under failures. -> By that Flink
guarantees that any form of state is the same under failure-free execution and
execution under failures (given that the sources support exactly-once
delivery). ?
- The difference between both interfaces is not so well described, in my
opinion.
Making local variables consistent
- There is a space in "c heckpoint"
- "For example the same counting, reduce function shown for OperatorStates
by using the Checkpointed interface instead" The Checkpointed interface is
described before OperatorState
Using the state interface
- Doesn't the Checkpointed interface require enableCheckpointing() to be
set? It is only mentioned for the OperatorState.
- Doesn't the Serializability requirement also apply to non-parallel state?
- This section is hard to understand. Many different things are mentioned
without a clear structure.
State checkpoints in iterative jobs
- Fink -> Flink
Iterations
- Strem -> Stream
> Update documentation to reflect new Stream/Window API
> -----------------------------------------------------
>
> Key: FLINK-2779
> URL: https://issues.apache.org/jira/browse/FLINK-2779
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming
> Reporter: Aljoscha Krettek
> Assignee: Kostas Tzoumas
> Fix For: 0.10
>
>