Github user ChrisChinchilla commented on a diff in the pull request:
https://github.com/apache/flink/pull/5045#discussion_r153374312
--- Diff: docs/concepts/programming-model.md ---
@@ -33,53 +33,52 @@ Flink offers different levels of abstraction to develop streaming/batch applications
<img src="../fig/levels_of_abstraction.svg" alt="Programming levels of
abstraction" class="offset" width="80%" />
- - The lowest level abstraction simply offers **stateful streaming**. It
is embedded into the [DataStream API](../dev/datastream_api.html)
- via the [Process
Function](../dev/stream/operators/process_function.html). It allows users
freely process events from one or more streams,
- and use consistent fault tolerant *state*. In addition, users can
register event time and processing time callbacks,
+ - The lowest level abstraction offers **stateful streaming** and is
embedded into the [DataStream API](../dev/datastream_api.html)
+ via the [Process
Function](../dev/stream/operators/process_function.html). It allows users to
process events from one or more streams,
+ and use consistent fault tolerant *state*. Users can register event
time and processing time callbacks,
allowing programs to realize sophisticated computations.
- - In practice, most applications would not need the above described low
level abstraction, but would instead program against the
+ - In practice, most applications would not need the low level
abstraction described above, but would instead program against the
**Core APIs** like the [DataStream API](../dev/datastream_api.html)
(bounded/unbounded streams) and the [DataSet API](../dev/batch/index.html)
- (bounded data sets). These fluent APIs offer the common building
blocks for data processing, like various forms of user-specified
+ (bounded data sets). These fluent APIs offer the common building
blocks for data processing, like various forms of user-specified
transformations, joins, aggregations, windows, state, etc. Data types
processed in these APIs are represented as classes
- in the respective programming languages.
+ in the respective programming languages.
- The low level *Process Function* integrates with the *DataStream API*,
making it possible to go the lower level abstraction
- for certain operations only. The *DataSet API* offers additional
primitives on bounded data sets, like loops/iterations.
+ The low level *Process Function* integrates with the *DataStream API*,
making it possible to use the lower level abstraction
+ for certain operations. The *DataSet API* offers additional primitives
on bounded data sets, like loops or iterations.
- The **Table API** is a declarative DSL centered around *tables*, which
may be dynamically changing tables (when representing streams).
- The [Table API](../dev/table_api.html) follows the (extended)
relational model: Tables have a schema attached (similar to tables in
relational databases)
+ The [Table API](../dev/table_api.html) follows the (extended)
relational model. Tables have a schema attached (similar to tables in
relational databases)
and the API offers comparable operations, such as select, project,
join, group-by, aggregate, etc.
- Table API programs declaratively define *what logical operation should
be done* rather than specifying exactly
- *how the code for the operation looks*. Though the Table API is
extensible by various types of user-defined
+ Table API programs declaratively define *what logical operation should
be performed* rather than specifying
+ *how the code for the operation looks*. While the Table API is extensible by
various types of user-defined
functions, it is less expressive than the *Core APIs*, but more
concise to use (less code to write).
- In addition, Table API programs also go through an optimizer that
applies optimization rules before execution.
+ Table API programs also go through an optimizer that applies
optimization rules before execution.
- One can seamlessly convert between tables and *DataStream*/*DataSet*,
allowing programs to mix *Table API* and with the *DataStream*
+ You can seamlessly convert between tables and *DataStream*/*DataSet*,
allowing programs to mix the *Table API* with the *DataStream*
and *DataSet* APIs.
- The highest level abstraction offered by Flink is **SQL**. This
abstraction is similar to the *Table API* both in semantics and
expressiveness, but represents programs as SQL query expressions.
- The [SQL](../dev/table_api.html#sql) abstraction closely interacts
with the Table API, and SQL queries can be executed over tables defined in the
*Table API*.
+ The [SQL](../dev/table_api.html#sql) abstraction closely interacts
with the Table API, and you can execute SQL queries over tables defined in the
*Table API*.
## Programs and Dataflows
-The basic building blocks of Flink programs are **streams** and
**transformations**. (Note that the
-DataSets used in Flink's DataSet API are also streams internally -- more
about that
-later.) Conceptually a *stream* is a (potentially never-ending) flow of
data records, and a *transformation* is an
+The basic building blocks of Flink programs are **streams** and
**transformations**. The
+DataSets used in Flink's DataSet API are also streams internally, a point
this document covers later. Conceptually, a *stream* is a (potentially
never-ending) flow of data records, and a *transformation* is an
--- End diff --
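The "streams and transformations" paragraph in the hunk above could be sketched in plain Python as a conceptual illustration. This is not the Flink API; every name below is made up for the example. It only shows the idea that a stream is a (potentially unbounded) flow of records and a transformation consumes one stream and produces another:

```python
# Conceptual sketch only -- NOT Flink code. A stream is modeled as an
# iterator of records; a transformation takes a stream and yields a new one.
from typing import Callable, Iterable, Iterator

def map_stream(stream: Iterable[int], fn: Callable[[int], int]) -> Iterator[int]:
    """A transformation: applies fn to every record in the stream."""
    for record in stream:
        yield fn(record)

def filter_stream(stream: Iterable[int], pred: Callable[[int], bool]) -> Iterator[int]:
    """A transformation: keeps only the records matching pred."""
    for record in stream:
        if pred(record):
            yield record

# A bounded "stream" of records (a DataSet is internally also a stream).
source = iter(range(10))
# Transformations compose into a dataflow: source -> filter -> map.
result = map_stream(filter_stream(source, lambda x: x % 2 == 0), lambda x: x * x)
print(list(result))  # -> [0, 4, 16, 36, 64]
```

Because transformations are lazy generators here, records flow through the pipeline one at a time, which loosely mirrors how a dataflow processes a never-ending stream.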
@greghogan I'm personally not a fan of hard line breaks, but I'm happy to stick
to them. I'm struggling to figure out what the character limit for the
project is; I can't see anything in any style files, and it's somewhat varied
throughout the docs. If there's a solid number, I'll add it to the
contributors guide.
---