Repository: flink
Updated Branches:
  refs/heads/master 4cc38fd36 -> 721220203


[FLINK-4654] [docs] Small improvements to the docs.

This closes #2525


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/72122020
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/72122020
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/72122020

Branch: refs/heads/master
Commit: 7212202036235f41e376872dc268735ba9ef81e9
Parents: 4cc38fd
Author: David Anderson <da...@alpinegizmo.com>
Authored: Tue Sep 20 14:54:44 2016 +0200
Committer: Greg Hogan <c...@greghogan.com>
Committed: Wed Sep 21 10:38:13 2016 -0400

----------------------------------------------------------------------
 docs/dev/datastream_api.md                |  6 +++---
 docs/dev/libs/cep.md                      |  9 +++++----
 docs/dev/state.md                         | 17 ++++++++---------
 docs/dev/state_backends.md                |  6 +++---
 docs/dev/windows.md                       |  4 ++--
 docs/quickstart/run_example_quickstart.md | 23 ++++++++++++-----------
 6 files changed, 33 insertions(+), 32 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/72122020/docs/dev/datastream_api.md
----------------------------------------------------------------------
diff --git a/docs/dev/datastream_api.md b/docs/dev/datastream_api.md
index 2dd7842..425dd6a 100644
--- a/docs/dev/datastream_api.md
+++ b/docs/dev/datastream_api.md
@@ -357,7 +357,7 @@ windowedStream.reduce (new 
ReduceFunction<Tuple2<String,Integer>() {
                The example function, when applied on the sequence (1,2,3,4,5),
                folds the sequence into the string "start-1-2-3-4-5":</p>
     {% highlight java %}
-windowedStream.fold("start-", new FoldFunction<Integer, String>() {
+windowedStream.fold("start", new FoldFunction<Integer, String>() {
     public String fold(String current, Integer value) {
         return current + "-" + value;
     }
@@ -1324,7 +1324,7 @@ File-based:
 
     *IMPORTANT NOTES:*
 
-    1. If the `watchType` is set to `FileProcessingMode.PROCESS_CONTINUOUSLY`, 
when a file is modified, its contents are re-processed entirely. This can brake 
the "exactly-once" semantics, as appending data at the end of a file will lead 
to **all** its contents being re-processed.
+    1. If the `watchType` is set to `FileProcessingMode.PROCESS_CONTINUOUSLY`, 
when a file is modified, its contents are re-processed entirely. This can break 
the "exactly-once" semantics, as appending data at the end of a file will lead 
to **all** its contents being re-processed.
 
     2. If the `watchType` is set to `FileProcessingMode.PROCESS_ONCE`, the 
source scans the path **once** and exits, without waiting for the readers to 
finish reading the file contents. Of course the readers will continue reading 
until all file contents are read. Closing the source leads to no more 
checkpoints after that point. This may lead to slower recovery after a node 
failure, as the job will resume reading from the last checkpoint.
 
@@ -1382,7 +1382,7 @@ File-based:
 
     *IMPORTANT NOTES:*
 
-    1. If the `watchType` is set to `FileProcessingMode.PROCESS_CONTINUOUSLY`, 
when a file is modified, its contents are re-processed entirely. This can brake 
the "exactly-once" semantics, as appending data at the end of a file will lead 
to **all** its contents being re-processed.
+    1. If the `watchType` is set to `FileProcessingMode.PROCESS_CONTINUOUSLY`, 
when a file is modified, its contents are re-processed entirely. This can break 
the "exactly-once" semantics, as appending data at the end of a file will lead 
to **all** its contents being re-processed.
 
     2. If the `watchType` is set to `FileProcessingMode.PROCESS_ONCE`, the 
source scans the path **once** and exits, without waiting for the readers to 
finish reading the file contents. Of course the readers will continue reading 
until all file contents are read. Closing the source leads to no more 
checkpoints after that point. This may lead to slower recovery after a node 
failure, as the job will resume reading from the last checkpoint.
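
As a quick illustration of the `watchType` behavior discussed in these notes, here is a hedged sketch assuming the `readFile` overload that takes (format, path, watchType, interval); the path and scan interval are placeholder values.

{% highlight java %}
// Hedged sketch: continuously monitor a path, re-reading files that change.
// The path and the 1000 ms scan interval are placeholders.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

TextInputFormat format = new TextInputFormat(new Path("file:///tmp/dir"));

DataStream<String> lines = env.readFile(
    format,
    "file:///tmp/dir",
    FileProcessingMode.PROCESS_CONTINUOUSLY,  // modified files are re-processed entirely
    1000L);                                   // how often to scan the path, in milliseconds
{% endhighlight %}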
 

http://git-wip-us.apache.org/repos/asf/flink/blob/72122020/docs/dev/libs/cep.md
----------------------------------------------------------------------
diff --git a/docs/dev/libs/cep.md b/docs/dev/libs/cep.md
index 77266bc..d27cf9f 100644
--- a/docs/dev/libs/cep.md
+++ b/docs/dev/libs/cep.md
@@ -98,7 +98,7 @@ val result: DataStream[Alert] = 
patternStream.select(createAlert(_))
 </div>
 </div>
 
-Note that we use use Java 8 lambdas in our Java code examples to make them 
more succinct.
+Note that we use Java 8 lambdas in our Java code examples to make them more 
succinct.
 
 ## The Pattern API
 
@@ -521,10 +521,11 @@ def flatSelectFn(pattern : mutable.Map[String, IN], 
collector : Collector[OUT])
 
 ### Handling Timed Out Partial Patterns
 
-Whenever a pattern has a window length associated via the `within` key word, 
it is possible that partial event patterns will be discarded because they 
exceed the window length.
-In order to react to these timeout events the `select` and `flatSelect` API 
calls allow to specify a timeout handler.
+Whenever a pattern has a window length associated via the `within` keyword, it 
is possible that partial event patterns will be discarded because they exceed 
the window length.
+In order to react to these timeout events the `select` and `flatSelect` API 
calls allow a timeout handler to be specified.
 This timeout handler is called for each partial event pattern which has timed 
out.
-The timeout handler receives all so far matched events of the partial pattern 
and the timestamp when the timeout was detected.
+The timeout handler receives all the events that have been matched so far by 
the pattern, and the timestamp when the timeout was detected.
+
 
 <div class="codetabs" markdown="1">
 <div data-lang="java" markdown="1">
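
As a hedged illustration of the timeout handler described above, assuming the two-argument `select` overload of this CEP version; `Event`, `TimeoutAlert`, `Alert` and the pattern name "start" are made-up placeholders.

{% highlight java %}
// Hedged sketch: select() with a timeout handler for timed out partial matches.
// Event, TimeoutAlert, Alert and the pattern name "start" are placeholders.
patternStream.select(
    new PatternTimeoutFunction<Event, TimeoutAlert>() {
        @Override
        public TimeoutAlert timeout(Map<String, Event> partialMatch, long timeoutTimestamp) {
            // receives the events matched so far plus the timestamp when the timeout was detected
            return new TimeoutAlert(partialMatch.get("start"), timeoutTimestamp);
        }
    },
    new PatternSelectFunction<Event, Alert>() {
        @Override
        public Alert select(Map<String, Event> fullMatch) {
            return new Alert(fullMatch);
        }
    });
{% endhighlight %}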

http://git-wip-us.apache.org/repos/asf/flink/blob/72122020/docs/dev/state.md
----------------------------------------------------------------------
diff --git a/docs/dev/state.md b/docs/dev/state.md
index ec8c5eb..37de0a8 100644
--- a/docs/dev/state.md
+++ b/docs/dev/state.md
@@ -73,20 +73,19 @@ active key (i.e. the key of the input element).
 It is important to keep in mind that these state objects are only used for 
interfacing
 with state. The state is not necessarily stored inside but might reside on 
disk or somewhere else.
 The second thing to keep in mind is that the value you get from the state
-depend on the key of the input element. So the value you get in one invocation 
of your
-user function can be different from the one you get in another invocation if 
the key of
-the element is different.
+depends on the key of the input element. So the value you get in one 
invocation of your
+user function can differ from the value in another invocation if the keys 
involved are different.
 
-To get a state handle you have to create a `StateDescriptor` this holds the 
name of the state
+To get a state handle you have to create a `StateDescriptor`. This holds the 
name of the state
 (as we will later see you can create several states, and they have to have 
unique names so
-that you can reference them), the type of the values that the state holds and 
possibly
+that you can reference them), the type of the values that the state holds, and 
possibly
 a user-specified function, such as a `ReduceFunction`. Depending on what type 
of state you
-want to retrieve you create one of `ValueStateDescriptor`, 
`ListStateDescriptor` or
-`ReducingStateDescriptor`.
+want to retrieve, you create either a `ValueStateDescriptor`, a 
`ListStateDescriptor` or
+a `ReducingStateDescriptor`.
 
 State is accessed using the `RuntimeContext`, so it is only possible in *rich 
functions*.
 Please see [here]({{ site.baseurl 
}}/apis/common/#specifying-transformation-functions) for
-information about that but we will also see an example shortly. The 
`RuntimeContext` that
+information about that, but we will also see an example shortly. The 
`RuntimeContext` that
 is available in a `RichFunction` has these methods for accessing state:
 
 * `ValueState<T> getState(ValueStateDescriptor<T>)`
@@ -147,7 +146,7 @@ env.fromElements(Tuple2.of(1L, 3L), Tuple2.of(1L, 5L), 
Tuple2.of(1L, 7L), Tuple2
 
 This example implements a poor man's counting window. We key the tuples by the 
first field
 (in the example all have the same key `1`). The function stores the count and 
a running sum in
-a `ValueState`, once the count reaches 2 it will emit the average and clear 
the state so that
+a `ValueState`. Once the count reaches 2 it will emit the average and clear 
the state so that
 we start over from `0`. Note that this would keep a different state value for 
each different input
 key if we had tuples with different values in the first field.
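
A hedged sketch of the descriptor-and-handle pattern described above, roughly as it would appear inside a rich function's open() method; the state name "average" and the default value are placeholders.

{% highlight java %}
// Hedged sketch: create a StateDescriptor and fetch the handle via the RuntimeContext.
// The name "average" and the default value Tuple2.of(0L, 0L) are placeholders.
ValueStateDescriptor<Tuple2<Long, Long>> descriptor =
    new ValueStateDescriptor<>(
        "average",                                                 // unique state name
        TypeInformation.of(new TypeHint<Tuple2<Long, Long>>() {}),
        Tuple2.of(0L, 0L));                                        // default value

ValueState<Tuple2<Long, Long>> sum = getRuntimeContext().getState(descriptor);
{% endhighlight %}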
 

http://git-wip-us.apache.org/repos/asf/flink/blob/72122020/docs/dev/state_backends.md
----------------------------------------------------------------------
diff --git a/docs/dev/state_backends.md b/docs/dev/state_backends.md
index e5b9c2a..70e472d 100644
--- a/docs/dev/state_backends.md
+++ b/docs/dev/state_backends.md
@@ -41,16 +41,16 @@ chosen **State Backend**.
 
 Out of the box, Flink bundles these state backends:
 
- - *MemoryStateBacked*
+ - *MemoryStateBackend*
  - *FsStateBackend*
  - *RocksDBStateBackend*
 
-If nothing else is configured, the system will use the MemoryStateBacked.
+If nothing else is configured, the system will use the MemoryStateBackend.
 
 
 ### The MemoryStateBackend
 
-The *MemoryStateBacked* holds data internally as objects on the Java heap. 
Key/value state and window operators hold hash tables
+The *MemoryStateBackend* holds data internally as objects on the Java heap. 
Key/value state and window operators hold hash tables
 that store the values, triggers, etc.
 
 Upon checkpoints, this state backend will snapshot the state and send it as 
part of the checkpoint acknowledgement messages to the
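
For reference, a minimal hedged sketch of switching a job away from the default MemoryStateBackend; the checkpoint URI is a placeholder.

{% highlight java %}
// Hedged sketch: keep checkpoint data on a file system instead of the JobManager heap.
// The HDFS URI is a placeholder.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStateBackend(new FsStateBackend("hdfs://namenode:40010/flink/checkpoints"));
{% endhighlight %}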

http://git-wip-us.apache.org/repos/asf/flink/blob/72122020/docs/dev/windows.md
----------------------------------------------------------------------
diff --git a/docs/dev/windows.md b/docs/dev/windows.md
index 67280dd..2611870 100644
--- a/docs/dev/windows.md
+++ b/docs/dev/windows.md
@@ -111,7 +111,7 @@ which we could process the aggregated elements.
 ### Tumbling Windows
 
 A *tumbling windows* assigner assigns elements to fixed length, 
non-overlapping windows of a
-specified *window size*.. For example, if you specify a window size of 5 
minutes, the window
+specified *window size*. For example, if you specify a window size of 5 
minutes, the window
 function will get 5 minutes worth of elements in each invocation.
 
 <img src="{{ site.baseurl }}/fig/tumbling-windows.svg" class="center" 
style="width: 80%;" />
@@ -381,7 +381,7 @@ a concatenation of all the `Long` fields of the input.
 
 ### WindowFunction - The Generic Case
 
-Using a `WindowFunction` provides most flexibility, at the cost of 
performance. The reason for this
+Using a `WindowFunction` provides the most flexibility, at the cost of 
performance. The reason for this
 is that elements cannot be incrementally aggregated for a window and instead 
need to be buffered
 internally until the window is considered ready for processing. A 
`WindowFunction` gets an
 `Iterable` containing all the elements of the window being processed. The 
signature of
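
A hedged sketch of the tumbling-window setup described in the first hunk above; keyedStream, the Tuple2<String, Long> element type, and the summing reduce function are placeholders.

{% highlight java %}
// Hedged sketch: five-minute tumbling time windows on a keyed stream.
// keyedStream and the Tuple2<String, Long> element type are placeholders.
keyedStream
    .timeWindow(Time.minutes(5))
    .reduce(new ReduceFunction<Tuple2<String, Long>>() {
        @Override
        public Tuple2<String, Long> reduce(Tuple2<String, Long> a, Tuple2<String, Long> b) {
            return new Tuple2<>(a.f0, a.f1 + b.f1);  // sum the Long field per key
        }
    });
{% endhighlight %}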

http://git-wip-us.apache.org/repos/asf/flink/blob/72122020/docs/quickstart/run_example_quickstart.md
----------------------------------------------------------------------
diff --git a/docs/quickstart/run_example_quickstart.md 
b/docs/quickstart/run_example_quickstart.md
index 70f8756..e079280 100644
--- a/docs/quickstart/run_example_quickstart.md
+++ b/docs/quickstart/run_example_quickstart.md
@@ -26,12 +26,12 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-In this guide we will start from scratch and go from setting up a Flink 
project and running
+In this guide we will start from scratch and go from setting up a Flink 
project to running
 a streaming analysis program on a Flink cluster.
 
 Wikipedia provides an IRC channel where all edits to the wiki are logged. We 
are going to
 read this channel in Flink and count the number of bytes that each user edits 
within
-a given window of time. This is easy enough to implement in a few minutes 
using Flink but it will
+a given window of time. This is easy enough to implement in a few minutes 
using Flink, but it will
 give you a good foundation from which to start building more complex analysis 
programs on your own.
 
 ## Setting up a Maven Project
@@ -125,21 +125,21 @@ public class WikipediaAnalysis {
 }
 {% endhighlight %}
 
-I admit it's very bare bones now but we will fill it as we go. Note, that I'll 
not give
+The program is very basic now, but we will fill it in as we go. Note that I'll 
not give
 import statements here since IDEs can add them automatically. At the end of 
this section I'll show
 the complete code with import statements if you simply want to skip ahead and 
enter that in your
 editor.
 
 The first step in a Flink program is to create a `StreamExecutionEnvironment`
 (or `ExecutionEnvironment` if you are writing a batch job). This can be used 
to set execution
-parameters and create sources for reading from external systems. So let's go 
ahead, add
+parameters and create sources for reading from external systems. So let's go 
ahead and add
 this to the main method:
 
 {% highlight java %}
 StreamExecutionEnvironment see = 
StreamExecutionEnvironment.getExecutionEnvironment();
 {% endhighlight %}
 
-Next, we will create a source that reads from the Wikipedia IRC log:
+Next we will create a source that reads from the Wikipedia IRC log:
 
 {% highlight java %}
 DataStream<WikipediaEditEvent> edits = see.addSource(new 
WikipediaEditsSource());
@@ -149,7 +149,7 @@ This creates a `DataStream` of `WikipediaEditEvent` 
elements that we can further
 the purposes of this example we are interested in determining the number of 
added or removed
 bytes that each user causes in a certain time window, let's say five seconds. 
For this we first
 have to specify that we want to key the stream on the user name, that is to 
say that operations
-on this should take the key into account. In our case the summation of edited 
bytes in the windows
+on this stream should take the user name into account. In our case the 
summation of edited bytes in the windows
 should be per unique user. For keying a Stream we have to provide a 
`KeySelector`, like this:
 
 {% highlight java %}
@@ -165,8 +165,8 @@ KeyedStream<WikipediaEditEvent, String> keyedEdits = edits
 This gives us a Stream of `WikipediaEditEvent` that has a `String` key, the 
user name.
 We can now specify that we want to have windows imposed on this stream and 
compute a
 result based on elements in these windows. A window specifies a slice of a 
Stream
-on which to perform a computation. They are required when performing an 
aggregation
-computation on an infinite stream of elements. In our example we will say
+on which to perform a computation. Windows are required when computing 
aggregations
+on an infinite stream of elements. In our example we will say
 that we want to aggregate the sum of edited bytes for every five seconds:
 
 {% highlight java %}
@@ -276,9 +276,10 @@ similar to this:
 The number in front of each line tells you on which parallel instance of the 
print sink the output
 was produced.
 
-This should get you started with writing your own Flink programs. You can 
check out our guides
-about [basic concepts]{{{ site.baseurl }}/apis/common/index.html} and the
-[DataStream API]{{{ site.baseurl }}/apis/streaming/index.html} if you want to 
learn more. Stick
+This should get you started with writing your own Flink programs. To learn 
more 
+you can check out our guides
+about [basic concepts]({{ site.baseurl }}/apis/common/index.html) and the
+[DataStream API]({{ site.baseurl }}/apis/streaming/index.html). Stick
 around for the bonus exercise if you want to learn about setting up a Flink 
cluster on
 your own machine and writing results to [Kafka](http://kafka.apache.org).
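
For completeness, a hedged sketch of the five-second windowed aggregation the quickstart builds up to, assuming keyedEdits is the KeyedStream<WikipediaEditEvent, String> created earlier; the fold logic mirrors the byte-count description above.

{% highlight java %}
// Hedged sketch: sum the edited bytes per user over five-second windows.
// Assumes keyedEdits is the KeyedStream<WikipediaEditEvent, String> built earlier.
DataStream<Tuple2<String, Long>> result = keyedEdits
    .timeWindow(Time.seconds(5))
    .fold(new Tuple2<>("", 0L), new FoldFunction<WikipediaEditEvent, Tuple2<String, Long>>() {
        @Override
        public Tuple2<String, Long> fold(Tuple2<String, Long> acc, WikipediaEditEvent event) {
            acc.f0 = event.getUser();
            acc.f1 += event.getByteDiff();
            return acc;
        }
    });
{% endhighlight %}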
 
