Github user greghogan commented on a diff in the pull request:
https://github.com/apache/flink/pull/5277#discussion_r160793612
--- Diff: docs/concepts/runtime.md ---
@@ -88,40 +94,36 @@ By default, Flink allows subtasks to share slots even if they are subtasks of di
they are from the same job. The result is that one slot may hold an entire pipeline of the
job. Allowing this *slot sharing* has two main benefits:
-  - A Flink cluster needs exactly as many task slots as the highest parallelism used in the job.
-    No need to calculate how many tasks (with varying parallelism) a program contains in total.
+  - A Flink cluster needs as many task slots as the highest parallelism used in the job.
+    There's no need to calculate how many tasks (with varying parallelism) a program contains in total.
  - It is easier to get better resource utilization. Without slot sharing, the non-intensive
-    *source/map()* subtasks would block as many resources as the resource intensive *window* subtasks.
+    *source/map()* subtasks would block as many resources as the resource-intensive *window* subtasks.
    With slot sharing, increasing the base parallelism in our example from two to six yields full utilization of the
-    slotted resources, while making sure that the heavy subtasks are fairly distributed among the TaskManagers.
+    slotted resources, while making sure that the heavy subtasks are evenly distributed among the TaskManagers.
<img src="../fig/slot_sharing.svg" alt="TaskManagers with shared Task Slots" class="offset" width="80%" />
-The APIs also include a *[resource group](../dev/datastream_api.html#task-chaining-and-resource-groups)* mechanism which can be used to prevent undesirable slot sharing.
+The APIs also include a *[resource group](../dev/datastream_api.html#task-chaining-and-resource-groups)* mechanism which you can use to prevent undesirable slot sharing.
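For readers following this discussion, a minimal sketch of that mechanism, assuming the DataStream API's `slotSharingGroup()` setting; the group name and the toy pipeline are illustrative, not part of this PR:

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotSharingSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("a", "b", "c")
            // the source and this map stay in the "default" slot sharing group
            .map(new MapFunction<String, String>() {
                @Override
                public String map(String value) {
                    return value.toUpperCase();
                }
            })
            // isolating a (hypothetically heavy) operator in its own group
            // keeps it from sharing slots with the operators above
            .slotSharingGroup("heavy")
            .print();

        env.execute("Slot sharing group sketch");
    }
}
```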
-As a rule-of-thumb, a good default number of task slots would be the number of CPU cores.
-With hyper-threading, each slot then takes 2 or more hardware thread contexts.
+As a rule-of-thumb, a reasonable default number of task slots would be the number of CPU cores. With hyper-threading, each slot then takes 2 or more hardware thread contexts.
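As a concrete illustration of that rule of thumb: in a standalone cluster the slot count is set per TaskManager via `taskmanager.numberOfTaskSlots` in `flink-conf.yaml`. The sketch below sets the equivalent option on a local environment; the 4-core machine is an assumed example:

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.TaskManagerOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TaskSlotSketch {
    public static void main(String[] args) {
        // assumed example: a machine with 4 CPU cores -> 4 task slots
        Configuration conf = new Configuration();
        conf.setInteger(TaskManagerOptions.NUM_TASK_SLOTS, 4);

        // local environment whose default parallelism matches the slot count
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.createLocalEnvironment(4, conf);
    }
}
```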
{% top %}
## State Backends
-The exact data structures in which the key/values indexes are stored depends on the chosen [state backend](../ops/state/state_backends.html). One state backend
-stores data in an in-memory hash map, another state backend uses [RocksDB](http://rocksdb.org) as the key/value store.
-In addition to defining the data structure that holds the state, the state backends also implement the logic to
-take a point-in-time snapshot of the key/value state and store that snapshot as part of a checkpoint.
+The exact data structures which store the key/value indexes depend on the chosen [state backend](../ops/state/state_backends.html). One state backend stores data in an in-memory hash map, another state backend uses [RocksDB](http://rocksdb.org) as the key/value store. In addition to defining the data structure that holds the state, the state backends also implement the logic to take a point-in-time snapshot of the key/value state and store that snapshot as part of a checkpoint.
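For context, a minimal sketch of selecting a state backend and enabling the periodic snapshots described above; the checkpoint interval and the HDFS URI are placeholder assumptions:

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StateBackendSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // take a point-in-time snapshot of the key/value state every 10 seconds
        env.enableCheckpointing(10_000);

        // FsStateBackend keeps working state on the heap and writes snapshots
        // to a file system; the URI below is a hypothetical example. Swapping
        // in RocksDBStateBackend would keep working state in RocksDB instead.
        env.setStateBackend(new FsStateBackend("hdfs://namenode:9000/flink/checkpoints"));
    }
}
```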
<img src="../fig/checkpoints.svg" alt="checkpoints and snapshots" class="offset" width="60%" />
{% top %}
## Savepoints
-Programs written in the Data Stream API can resume execution from a **savepoint**. Savepoints allow both updating your programs and your Flink cluster without losing any state.
+Programs written in the Data Stream API can resume execution from a **savepoint**. Savepoints allow updating both your programs and your Flink cluster without losing any state.
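One practical detail behind "updating your programs without losing any state": when resuming from a savepoint, Flink matches the stored state to operators by ID, so pinning IDs explicitly with `uid()` keeps a modified program compatible. A small sketch with assumed operator logic and ID:

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SavepointFriendlySketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(1, 2, 3)
            .map(new MapFunction<Integer, Integer>() {
                @Override
                public Integer map(Integer value) {
                    return value * 2;
                }
            })
            // a stable, explicit operator ID lets a later version of this
            // program locate this operator's state in a savepoint
            .uid("doubler")
            .print();

        env.execute("Savepoint-friendly sketch");
    }
}
```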
-[Savepoints](../ops/state/savepoints.html) are **manually triggered checkpoints**, which take a snapshot of the program and write it out to a state backend. They rely on the regular checkpointing mechanism for this. During execution programs are periodically snapshotted on the worker nodes and produce checkpoints. For recovery only the last completed checkpoint is needed and older checkpoints can be safely discarded as soon as a new one is completed.
+[Savepoints](../ops/state/savepoints.html) are **manually triggered checkpoints**, which take a snapshot of the program and write it out to a state backend. They rely on the regular checkpointing mechanism for this. During execution, programs are periodically snapshotted on the worker nodes and produce checkpoints. You only need the last completed checkpoint for recovery, and you can safely discard older checkpoints as soon as a new one is completed.
-Savepoints are similar to these periodic checkpoints except that they are **triggered by the user** and **don't automatically expire** when newer checkpoints are completed. Savepoints can be created from the [command line](../ops/cli.html#savepoints) or when cancelling a job via the [REST API](../monitoring/rest_api.html#cancel-job-with-savepoint).
+Savepoints are similar to these periodic checkpoints except that they are **triggered by the user** and **don't automatically expire** when newer checkpoints are completed. You can create savepoints can from the [command line](../ops/cli.html#savepoints) or when canceling a job via the [REST API](../monitoring/rest_api.html#cancel-job-with-savepoint).
--- End diff ---
"savepoints can" -> "savepoints"