Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1208#issuecomment-146378690
  
    Review 2nd part (until Connectors):
    
    General & Outline
    - Inconsistent capitalization of sections
    - Switch "Execution Configuration" and "Data Sinks" sections to have Sinks 
directly after Sources?
    
    Specifying Keys
    - "A DataStream is keyed OAS"
    
    Data Types
    - and as results of transformations. -> and ON THE results of 
transformations.
    
    Data Sources (Java and Scala)
    - Creates a data set -> Creates a data stream (4 times)
    
    Debugging
    - Before running a streaming program on a large data set in a distributed 
cluster -> Before running a DataStream program in a distributed cluster ?
    
    Debugging - Local Execution Environment
    - breakpoint -> breakpointS
    
    Working with time
    - The event time paragraph says: "This time is either recorded in a 
timestamp embedded within the records before they enter Flink, or is assigned 
at the source." Isn't it Ingestion time if timestamps are assigned at the 
source?
    - Does ingestion time not require watermarks?
    - "In order to work with event time semantics, you need to follow three 
steps": isn't it four steps (the last step is really two steps)?
    - What is the getCurrentWatermark() { return Long.MIN_VALUE; } good for?
    
    Windows on keyed data streams
    - Why do sliding windows "overlap by AT LEAST 4 secs / 900 elements"? 
Shouldn't it be: "subsequent windows overlap by 4 secs / 900 elements"?
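
    The arithmetic behind this question, as a minimal sketch: two consecutive 
sliding windows share exactly (size - slide) of their extent, not "at least" 
that much. The sizes used here (5 s sliding by 1 s, and 1000 elements sliding by 
100) are assumptions chosen to match the quoted 4 secs / 900 elements; overlap() 
is a hypothetical helper, not a Flink API.

```java
class SlidingWindowOverlap {
    // Extent shared by two consecutive sliding windows: size minus slide,
    // or zero when the slide is at least as large as the window itself.
    static long overlap(long size, long slide) {
        return Math.max(0L, size - slide);
    }

    public static void main(String[] args) {
        // Assumed time window: size 5 s, slide 1 s.
        System.out.println(overlap(5L, 1L));      // prints 4
        // Assumed count window: size 1000 elements, slide 100 elements.
        System.out.println(overlap(1000L, 100L)); // prints 900
    }
}
```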
    
    Advanced window constructs
    - "the SlidingTimeWindows assigner in the example": which example are you 
referring to? It should have a size of 5000 and a slide of 1000 ms, not 1000 
and 100 ms, right?
    - Delta trigger: does "the current element" refer to the element that was 
last added to the window?
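
    On the SlidingTimeWindows point above, a minimal sketch of how a sliding 
assigner could map one timestamp to its windows, assuming a size of 5000 ms, a 
slide of 1000 ms, and windows aligned to timestamp 0. windowStarts() is a 
hypothetical helper for illustration, not Flink's actual SlidingTimeWindows 
implementation.

```java
import java.util.ArrayList;
import java.util.List;

class SlidingWindowAssignment {
    // Returns the start timestamps of all windows [start, start + size)
    // that contain the given timestamp, newest window first.
    static List<Long> windowStarts(long timestamp, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        // Start of the most recent window that began at or before the timestamp.
        long lastStart = timestamp - Math.floorMod(timestamp, slide);
        // Walk backwards while the window still covers the timestamp.
        for (long start = lastStart; start > timestamp - size; start -= slide) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // An element at 7200 ms with size 5000 ms and slide 1000 ms.
        System.out.println(windowStarts(7200L, 5000L, 1000L));
        // prints [7000, 6000, 5000, 4000, 3000]
    }
}
```

    With these assumed values each element lands in size / slide = 5 windows; 
with 1000 ms and 100 ms it would land in 10.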
    
    Fault tolerance
    - where n is how often a checkpoint is taken in milliseconds. -> where n is 
the checkpointing interval in milliseconds ?
    - Is there no Kafka Sink (KafkaProducer)? It is missing in the sink 
guarantee table.
    
    Parallelism
    - Shouldn't we add here how data is repartitioned if the degree of 
parallelism (DOP) of subsequent operators increases or decreases?
    
    Controlling latency
    - Also show the parameter for the buffer size.
    
    Working with state
    - The end effect is that updates to any form of state are the same under 
failure-free execution and execution under failures. -> This way, Flink 
guarantees that any form of state is the same under failure-free execution and 
execution under failures (given that the sources support exactly-once 
delivery)?
    - The difference between both interfaces is not so well described, in my 
opinion.
    
    Making local variables consistent
    - There is a space in "c heckpoint"
    - "For example the same counting, reduce function shown for OperatorStates 
by using the Checkpointed interface instead": the Checkpointed interface is 
described before OperatorState.
    
    Using the state interface
    - Doesn't the checkpointed interface require enableCheckpointing() to be 
set? It is only mentioned for the OperatorState.
    - Doesn't the Serializability requirement also apply to non-parallel state?
    - This section is hard to understand. Many different things are mentioned 
without a clear structure.
    
    State checkpoints in iterative jobs
    - Fink -> Flink
    
    Iterations
    - Strem -> Stream


