[09/15] flink git commit: [FLINK-3132] [docs] Initial docs restructure

uce Fri, 15 Jan 2016 07:50:09 -0800

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/examples.md
----------------------------------------------------------------------
diff --git a/docs/apis/examples.md b/docs/apis/examples.md
deleted file mode 100644
index a11ed9c..0000000
--- a/docs/apis/examples.md
+++ /dev/null
@@ -1,516 +0,0 @@
----
-title:  "Bundled Examples"
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-The following example programs showcase different applications of Flink 
-from simple word counting to graph algorithms. The code samples illustrate the 
-use of [Flink's API](programming_guide.html). 
-
-The full source code of the following and more examples can be found in the __flink-java-examples__
-or __flink-scala-examples__ module of the Flink source repository.
-
-* This will be replaced by the TOC
-{:toc}
-
-
-## Running an example
-
-To run a Flink example, you need a running Flink instance. The "Setup" tab in the navigation describes various ways of starting Flink.
-
-The easiest way is to run the `./bin/start-local.sh` script, which starts a JobManager locally.
-
-Each binary release of Flink contains an `examples` directory with jar files for each of the examples on this page.
-
-To run the WordCount example, issue the following command:
-
-~~~bash
-./bin/flink run ./examples/batch/WordCount.jar
-~~~
-
-The other examples can be started in a similar way.
-
-Note that many examples run without any arguments by using built-in data. To run WordCount with real data, you have to pass the path to the data:
-
-~~~bash
-./bin/flink run ./examples/batch/WordCount.jar /path/to/some/text/data /path/to/result
-~~~
-
-Note that non-local file systems require a scheme prefix, such as `hdfs://`.
-
-
-## Word Count
-WordCount is the "Hello World" of Big Data processing systems. It computes the frequency of words in a text collection. The algorithm works in two steps: First, the text is split into individual words. Second, the words are grouped and counted.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-~~~java
-ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
-
-DataSet<String> text = env.readTextFile("/path/to/file"); 
-
-DataSet<Tuple2<String, Integer>> counts = 
-        // split up the lines in pairs (2-tuples) containing: (word,1)
-        text.flatMap(new Tokenizer())
-        // group by the tuple field "0" and sum up tuple field "1"
-        .groupBy(0)
-        .sum(1);
-
-counts.writeAsCsv(outputPath, "\n", " ");
-
-// User-defined functions
-public static class Tokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> {
-
-    @Override
-    public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
-        // normalize and split the line
-        String[] tokens = value.toLowerCase().split("\\W+");
-        
-        // emit the pairs
-        for (String token : tokens) {
-            if (token.length() > 0) {
-                out.collect(new Tuple2<String, Integer>(token, 1));
-            }   
-        }
-    }
-}
-~~~
-
-The {% gh_link /flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/wordcount/WordCount.java "WordCount example" %} implements the algorithm described above with input parameters: `<text input path>, <output path>`. As test data, any text file will do.
-
-</div>
-<div data-lang="scala" markdown="1">
-
-~~~scala
-val env = ExecutionEnvironment.getExecutionEnvironment
-
-// get input data
-val text = env.readTextFile("/path/to/file")
-
-val counts = text.flatMap { _.toLowerCase.split("\\W+") filter { _.nonEmpty } }
-  .map { (_, 1) }
-  .groupBy(0)
-  .sum(1)
-
-counts.writeAsCsv(outputPath, "\n", " ")
-~~~
-
-The {% gh_link /flink-examples/flink-scala-examples/src/main/scala/org/apache/flink/examples/scala/wordcount/WordCount.scala "WordCount example" %} implements the algorithm described above with input parameters: `<text input path>, <output path>`. As test data, any text file will do.
-
-
-</div>
-</div>
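
The two steps described above (split the text into words, then group and count) can be sketched without Flink in plain Java. This is only an illustration of the algorithm, not the bundled example: the class `WordCountSketch` and the method `countWords` are hypothetical names, and a sorted in-memory map stands in for the distributed `groupBy(0).sum(1)`.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the two WordCount steps; hypothetical names, no Flink API involved.
class WordCountSketch {

    // Step 1: normalize and split the text; step 2: group by word and sum the counts.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                // equivalent of emitting (word, 1) and summing per word
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {be=2, not=1, or=1, to=2}
        System.out.println(countWords("To be, or not to be"));
    }
}
```

The same regular expression (`\\W+`) and empty-token check appear in the `Tokenizer` shown in the Java tab.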
-
-## Page Rank
-
-The PageRank algorithm computes the "importance" of pages in a graph defined by links, which point from one page to another. It is an iterative graph algorithm, which means that it repeatedly applies the same computation. In each iteration, each page distributes its current rank over all its neighbors, and computes its new rank as a taxed sum of the ranks it received from its neighbors. The PageRank algorithm was popularized by the Google search engine, which uses the importance of webpages to rank the results of search queries.
-
-In this simple example, PageRank is implemented with a [bulk iteration](iterations.html) and a fixed number of iterations.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-~~~java
-ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
-
-// read the pages and initial ranks by parsing a CSV file
-DataSet<Tuple2<Long, Double>> pagesWithRanks = env.readCsvFile(pagesInputPath)
-                                                  .types(Long.class, Double.class);
-
-// the links are encoded as an adjacency list: (page-id, Array(neighbor-ids))
-DataSet<Tuple2<Long, Long[]>> pageLinkLists = getLinksDataSet(env);
-
-// set iterative data set
-IterativeDataSet<Tuple2<Long, Double>> iteration = pagesWithRanks.iterate(maxIterations);
-
-DataSet<Tuple2<Long, Double>> newRanks = iteration
-        // join pages with outgoing edges and distribute rank
-        .join(pageLinkLists).where(0).equalTo(0).with(new JoinVertexWithEdgesMatch())
-        // collect and sum ranks
-        .groupBy(0).sum(1)
-        // apply dampening factor
-        .map(new Dampener(DAMPENING_FACTOR, numPages));
-
-DataSet<Tuple2<Long, Double>> finalPageRanks = iteration.closeWith(
-        newRanks, 
-        newRanks.join(iteration).where(0).equalTo(0)
-        // termination condition
-        .filter(new EpsilonFilter()));
-
-finalPageRanks.writeAsCsv(outputPath, "\n", " ");
-
-// User-defined functions
-
-public static final class JoinVertexWithEdgesMatch 
-                    implements FlatJoinFunction<Tuple2<Long, Double>, Tuple2<Long, Long[]>, 
-                                            Tuple2<Long, Double>> {
-
-    @Override
-    public void join(Tuple2<Long, Double> page, Tuple2<Long, Long[]> adj, 
-                        Collector<Tuple2<Long, Double>> out) {
-        Long[] neighbors = adj.f1;
-        double rank = page.f1;
-        double rankToDistribute = rank / ((double) neighbors.length);
-            
-        for (int i = 0; i < neighbors.length; i++) {
-            out.collect(new Tuple2<Long, Double>(neighbors[i], rankToDistribute));
-        }
-    }
-}
-
-public static final class Dampener implements MapFunction<Tuple2<Long,Double>, Tuple2<Long,Double>> {
-    private final double dampening, randomJump;
-
-    public Dampener(double dampening, double numVertices) {
-        this.dampening = dampening;
-        this.randomJump = (1 - dampening) / numVertices;
-    }
-
-    @Override
-    public Tuple2<Long, Double> map(Tuple2<Long, Double> value) {
-        value.f1 = (value.f1 * dampening) + randomJump;
-        return value;
-    }
-}
-
-public static final class EpsilonFilter 
-                implements FilterFunction<Tuple2<Tuple2<Long, Double>, Tuple2<Long, Double>>> {
-
-    @Override
-    public boolean filter(Tuple2<Tuple2<Long, Double>, Tuple2<Long, Double>> value) {
-        return Math.abs(value.f0.f1 - value.f1.f1) > EPSILON;
-    }
-}
-~~~
-
-The {% gh_link /flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/graph/PageRankBasic.java "PageRank program" %} implements the above example.
-It requires the following parameters to run: `<pages input path>, <links input path>, <output path>, <num pages>, <num iterations>`.
-
-</div>
-<div data-lang="scala" markdown="1">
-
-~~~scala
-// User-defined types
-case class Link(sourceId: Long, targetId: Long)
-case class Page(pageId: Long, rank: Double)
-case class AdjacencyList(sourceId: Long, targetIds: Array[Long])
-
-// set up execution environment
-val env = ExecutionEnvironment.getExecutionEnvironment
-
-// read the page IDs by parsing a CSV file
-val pages = env.readCsvFile[Tuple1[Long]](pagesInputPath)
-
-// read the links as (source-id, target-id) pairs
-val links = env.readCsvFile[Link](linksInputPath)
-
-// assign initial ranks to pages
-val pagesWithRanks = pages.map(p => Page(p._1, 1.0 / numPages))
-
-// build adjacency list from link input
-val adjacencyLists = links
-  // initialize lists
-  .map(e => AdjacencyList(e.sourceId, Array(e.targetId)))
-  // concatenate lists
-  .groupBy("sourceId").reduce {
-  (l1, l2) => AdjacencyList(l1.sourceId, l1.targetIds ++ l2.targetIds)
-  }
-
-// start iteration
-val finalRanks = pagesWithRanks.iterateWithTermination(maxIterations) {
-  currentRanks =>
-    val newRanks = currentRanks
-      // distribute ranks to target pages
-      .join(adjacencyLists).where("pageId").equalTo("sourceId") {
-        (page, adjacent, out: Collector[Page]) =>
-        for (targetId <- adjacent.targetIds) {
-          out.collect(Page(targetId, page.rank / adjacent.targetIds.length))
-        }
-      }
-      // collect ranks and sum them up
-      .groupBy("pageId").aggregate(SUM, "rank")
-      // apply dampening factor
-      .map { p =>
-        Page(p.pageId, (p.rank * DAMPENING_FACTOR) + ((1 - DAMPENING_FACTOR) / numPages))
-      }
-
-    // terminate if no rank update was significant
-    val termination = currentRanks.join(newRanks).where("pageId").equalTo("pageId") {
-      (current, next, out: Collector[Int]) =>
-        // check for significant update
-        if (math.abs(current.rank - next.rank) > EPSILON) out.collect(1)
-    }
-
-    (newRanks, termination)
-}
-
-val result = finalRanks
-
-// emit result
-result.writeAsCsv(outputPath, "\n", " ")
-~~~
-
-The {% gh_link /flink-examples/flink-scala-examples/src/main/scala/org/apache/flink/examples/scala/graph/PageRankBasic.scala "PageRank program" %} implements the above example.
-It requires the following parameters to run: `<pages input path>, <links input path>, <output path>, <num pages>, <num iterations>`.
-</div>
-</div>
-
-Input files are plain text files and must be formatted as follows:
-- Pages are represented as (long) IDs separated by new-line characters.
-    * For example `"1\n2\n12\n42\n63\n"` gives five pages with IDs 1, 2, 12, 42, and 63.
-- Links are represented as pairs of page IDs which are separated by space characters. Links are separated by new-line characters:
-    * For example `"1 2\n2 12\n1 12\n42 63\n"` gives four (directed) links (1)->(2), (2)->(12), (1)->(12), and (42)->(63).
-
-For this simple implementation it is required that each page has at least one incoming and one outgoing link (a page can point to itself).
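
A single PageRank iteration as described in this section (distribute each page's rank evenly over its outgoing links, sum what each page receives, then apply the dampening factor) can be sketched in plain Java. This is a hedged illustration, not the bundled program: `PageRankStepSketch` and `step` are hypothetical names, in-memory maps stand in for the data sets, and the sketch assumes every page in `links` also has an entry in `ranks`.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of one PageRank iteration; hypothetical names, no Flink API involved.
class PageRankStepSketch {

    // ranks: page id -> current rank; links: page id -> ids of pages it links to.
    static Map<Long, Double> step(Map<Long, Double> ranks, Map<Long, long[]> links, double dampening) {
        int numPages = ranks.size();

        // each page distributes its rank evenly over its neighbors; shares are summed per target
        Map<Long, Double> received = new HashMap<>();
        for (Map.Entry<Long, long[]> entry : links.entrySet()) {
            long[] neighbors = entry.getValue();
            double share = ranks.get(entry.getKey()) / neighbors.length;
            for (long neighbor : neighbors) {
                received.merge(neighbor, share, Double::sum);
            }
        }

        // apply the dampening factor, mirroring the Dampener function above
        double randomJump = (1 - dampening) / numPages;
        Map<Long, Double> next = new HashMap<>();
        for (Map.Entry<Long, Double> entry : received.entrySet()) {
            next.put(entry.getKey(), entry.getValue() * dampening + randomJump);
        }
        return next;
    }

    public static void main(String[] args) {
        Map<Long, Double> ranks = Map.of(1L, 0.8, 2L, 0.2);
        Map<Long, long[]> links = Map.of(1L, new long[]{2L}, 2L, new long[]{1L});
        System.out.println(step(ranks, links, 0.85));
    }
}
```

In this sketch a page without an incoming link receives nothing and therefore drops out of the result, which is one way to see why the simple implementation requires at least one incoming link per page.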
-
-## Connected Components
-
-The Connected Components algorithm identifies parts of a larger graph which are connected by assigning all vertices in the same connected part the same component ID. Similar to PageRank, Connected Components is an iterative algorithm. In each step, each vertex propagates its current component ID to all its neighbors. A vertex accepts the component ID from a neighbor if it is smaller than its own component ID.
-
-This implementation uses a [delta iteration](iterations.html): Vertices that have not changed their component ID do not participate in the next step. This yields much better performance, because the later iterations typically deal only with a few outlier vertices.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-~~~java
-// read vertex and edge data
-DataSet<Long> vertices = getVertexDataSet(env);
-DataSet<Tuple2<Long, Long>> edges = getEdgeDataSet(env).flatMap(new UndirectEdge());
-
-// assign the initial component IDs (equal to the vertex ID)
-DataSet<Tuple2<Long, Long>> verticesWithInitialId = vertices.map(new DuplicateValue<Long>());
-        
-// open a delta iteration
-DeltaIteration<Tuple2<Long, Long>, Tuple2<Long, Long>> iteration =
-        verticesWithInitialId.iterateDelta(verticesWithInitialId, maxIterations, 0);
-
-// apply the step logic: 
-DataSet<Tuple2<Long, Long>> changes = iteration.getWorkset()
-        // join with the edges
-        .join(edges).where(0).equalTo(0).with(new NeighborWithComponentIDJoin())
-        // select the minimum neighbor component ID
-        .groupBy(0).aggregate(Aggregations.MIN, 1)
-        // update if the component ID of the candidate is smaller
-        .join(iteration.getSolutionSet()).where(0).equalTo(0)
-        .flatMap(new ComponentIdFilter());
-
-// close the delta iteration (delta and new workset are identical)
-DataSet<Tuple2<Long, Long>> result = iteration.closeWith(changes, changes);
-
-// emit result
-result.writeAsCsv(outputPath, "\n", " ");
-
-// User-defined functions
-
-public static final class DuplicateValue<T> implements MapFunction<T, Tuple2<T, T>> {
-    
-    @Override
-    public Tuple2<T, T> map(T vertex) {
-        return new Tuple2<T, T>(vertex, vertex);
-    }
-}
-
-public static final class UndirectEdge 
-                    implements FlatMapFunction<Tuple2<Long, Long>, Tuple2<Long, Long>> {
-    Tuple2<Long, Long> invertedEdge = new Tuple2<Long, Long>();
-    
-    @Override
-    public void flatMap(Tuple2<Long, Long> edge, Collector<Tuple2<Long, Long>> out) {
-        invertedEdge.f0 = edge.f1;
-        invertedEdge.f1 = edge.f0;
-        out.collect(edge);
-        out.collect(invertedEdge);
-    }
-}
-
-public static final class NeighborWithComponentIDJoin 
-                implements JoinFunction<Tuple2<Long, Long>, Tuple2<Long, Long>, Tuple2<Long, Long>> {
-
-    @Override
-    public Tuple2<Long, Long> join(Tuple2<Long, Long> vertexWithComponent, Tuple2<Long, Long> edge) {
-        return new Tuple2<Long, Long>(edge.f1, vertexWithComponent.f1);
-    }
-}
-
-public static final class ComponentIdFilter 
-                    implements FlatMapFunction<Tuple2<Tuple2<Long, Long>, Tuple2<Long, Long>>, 
-                                            Tuple2<Long, Long>> {
-
-    @Override
-    public void flatMap(Tuple2<Tuple2<Long, Long>, Tuple2<Long, Long>> value, 
-                        Collector<Tuple2<Long, Long>> out) {
-        if (value.f0.f1 < value.f1.f1) {
-            out.collect(value.f0);
-        }
-    }
-}
-~~~
-
-The {% gh_link /flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/graph/ConnectedComponents.java "ConnectedComponents program" %} implements the above example. It requires the following parameters to run: `<vertex input path>, <edge input path>, <output path>, <max num iterations>`.
-
-</div>
-<div data-lang="scala" markdown="1">
-
-~~~scala
-// set up execution environment
-val env = ExecutionEnvironment.getExecutionEnvironment
-
-// read vertex and edge data
-// assign the initial components (equal to the vertex id)
-val vertices = getVerticesDataSet(env).map { id => (id, id) }
-
-// make edges undirected by emitting, for each input edge, the edge itself and an inverted version
-val edges = getEdgesDataSet(env).flatMap { edge => Seq(edge, (edge._2, edge._1)) }
-
-// open a delta iteration
-val verticesWithComponents = vertices.iterateDelta(vertices, maxIterations, Array(0)) {
-  (s, ws) =>
-
-    // apply the step logic: join with the edges
-    val allNeighbors = ws.join(edges).where(0).equalTo(0) { (vertex, edge) =>
-      (edge._2, vertex._2)
-    }
-
-    // select the minimum neighbor
-    val minNeighbors = allNeighbors.groupBy(0).min(1)
-
-    // update if the component of the candidate is smaller
-    val updatedComponents = minNeighbors.join(s).where(0).equalTo(0) {
-      (newVertex, oldVertex, out: Collector[(Long, Long)]) =>
-        if (newVertex._2 < oldVertex._2) out.collect(newVertex)
-    }
-
-    // delta and new workset are identical
-    (updatedComponents, updatedComponents)
-}
-
-verticesWithComponents.writeAsCsv(outputPath, "\n", " ")
-    
-~~~
-
-The {% gh_link /flink-examples/flink-scala-examples/src/main/scala/org/apache/flink/examples/scala/graph/ConnectedComponents.scala "ConnectedComponents program" %} implements the above example. It requires the following parameters to run: `<vertex input path>, <edge input path>, <output path>, <max num iterations>`.
-</div>
-</div>
-
-Input files are plain text files and must be formatted as follows:
-- Vertices are represented as IDs separated by new-line characters.
-    * For example `"1\n2\n12\n42\n63\n"` gives five vertices with IDs 1, 2, 12, 42, and 63.
-- Edges are represented as pairs of vertex IDs which are separated by space characters. Edges are separated by new-line characters:
-    * For example `"1 2\n2 12\n1 12\n42 63\n"` gives four (undirected) links (1)-(2), (2)-(12), (1)-(12), and (42)-(63).
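
The propagation rule described in this section (every vertex starts with its own ID as component ID and adopts a neighbor's ID whenever that one is smaller) can be sketched in plain Java. This is an assumption-laden illustration rather than the bundled program: `ConnectedComponentsSketch` and `components` are hypothetical names, and the loop updates labels in place until nothing changes instead of using a delta iteration with a workset and a solution set.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of minimum-label propagation; hypothetical names, no Flink API involved.
class ConnectedComponentsSketch {

    // Returns vertex id -> component id, where the component id is the smallest vertex id reachable.
    static Map<Long, Long> components(long[] vertices, long[][] edges) {
        Map<Long, Long> component = new HashMap<>();
        for (long vertex : vertices) {
            component.put(vertex, vertex); // initial component ID equals the vertex ID
        }

        boolean changed = true;
        while (changed) {
            changed = false;
            for (long[] edge : edges) {
                // edges are undirected, so the smaller ID flows in both directions
                long a = component.get(edge[0]);
                long b = component.get(edge[1]);
                long min = Math.min(a, b);
                if (a != min) {
                    component.put(edge[0], min);
                    changed = true;
                }
                if (b != min) {
                    component.put(edge[1], min);
                    changed = true;
                }
            }
        }
        return component;
    }

    public static void main(String[] args) {
        long[] vertices = {1, 2, 12, 42, 63};
        long[][] edges = {{1, 2}, {2, 12}, {1, 12}, {42, 63}};
        System.out.println(components(vertices, edges));
    }
}
```

With the sample input from the format description, vertices 1, 2, and 12 end up in component 1, and vertices 42 and 63 in component 42.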
-
-## Relational Query
-
-The Relational Query example assumes two tables, one with `orders` and the other with `lineitems` as specified by the [TPC-H decision support benchmark](http://www.tpc.org/tpch/). TPC-H is a standard benchmark in the database industry. See below for instructions on how to generate the input data.
-
-The example implements the following SQL query.
-
-~~~sql
-SELECT l_orderkey, o_shippriority, sum(l_extendedprice) as revenue
-    FROM orders, lineitem
-WHERE l_orderkey = o_orderkey
-    AND o_orderstatus = 'F'
-    AND YEAR(o_orderdate) > 1993
-    AND o_orderpriority LIKE '5%'
-GROUP BY l_orderkey, o_shippriority;
-~~~
-
-The Flink program that implements the above query looks as follows.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-~~~java
-// get orders data set: (orderkey, orderstatus, orderdate, orderpriority, shippriority)
-DataSet<Tuple5<Integer, String, String, String, Integer>> orders = getOrdersDataSet(env);
-// get lineitem data set: (orderkey, extendedprice)
-DataSet<Tuple2<Integer, Double>> lineitems = getLineitemDataSet(env);
-
-// orders filtered by year: (orderkey, shippriority)
-DataSet<Tuple2<Integer, Integer>> ordersFilteredByYear =
-        // filter orders
-        orders.filter(
-            new FilterFunction<Tuple5<Integer, String, String, String, Integer>>() {
-                @Override
-                public boolean filter(Tuple5<Integer, String, String, String, Integer> t) {
-                    // status filter
-                    if(!t.f1.equals(STATUS_FILTER)) {
-                        return false;
-                    // year filter
-                    } else if(Integer.parseInt(t.f2.substring(0, 4)) <= YEAR_FILTER) {
-                        return false;
-                    // order priority filter
-                    } else if(!t.f3.startsWith(OPRIO_FILTER)) {
-                        return false;
-                    }
-                    return true;
-                }
-            })
-        // project fields out that are no longer required
-        .project(0,4).types(Integer.class, Integer.class);
-
-// join orders with lineitems: (orderkey, shippriority, extendedprice)
-DataSet<Tuple3<Integer, Integer, Double>> lineitemsOfOrders = 
-        ordersFilteredByYear.joinWithHuge(lineitems)
-                            .where(0).equalTo(0)
-                            .projectFirst(0,1).projectSecond(1)
-                            .types(Integer.class, Integer.class, Double.class);
-
-// extendedprice sums: (orderkey, shippriority, sum(extendedprice))
-DataSet<Tuple3<Integer, Integer, Double>> priceSums = 
-        // group by order and sum extendedprice
-        lineitemsOfOrders.groupBy(0,1).aggregate(Aggregations.SUM, 2);
-
-// emit result
-priceSums.writeAsCsv(outputPath);
-~~~
-
-The {% gh_link /flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/TPCHQuery10.java "Relational Query program" %} implements the above query. It requires the following parameters to run: `<orders input path>, <lineitem input path>, <output path>`.
-
-</div>
-<div data-lang="scala" markdown="1">
-Coming soon...
-
-The {% gh_link /flink-examples/flink-scala-examples/src/main/scala/org/apache/flink/examples/scala/relational/TPCHQuery3.scala "Relational Query program" %} implements the above query. It requires the following parameters to run: `<orders input path>, <lineitem input path>, <output path>`.
-
-</div>
-</div>
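
The filter, join, and grouped sum of the query can also be traced on a handful of in-memory rows. The following plain-Java sketch is illustrative only: `RelationalQuerySketch`, `Order`, `LineItem`, `matches`, and `revenue` are hypothetical names, the filter constants are taken from the SQL text, and the result is keyed by the string `orderkey|shippriority` purely for readability.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the filter / join / group-and-sum query; hypothetical names, no Flink API.
class RelationalQuerySketch {

    static final class Order {
        final int orderKey; final String status; final String date;
        final String priority; final int shipPriority;
        Order(int orderKey, String status, String date, String priority, int shipPriority) {
            this.orderKey = orderKey; this.status = status; this.date = date;
            this.priority = priority; this.shipPriority = shipPriority;
        }
    }

    static final class LineItem {
        final int orderKey; final double extendedPrice;
        LineItem(int orderKey, double extendedPrice) {
            this.orderKey = orderKey; this.extendedPrice = extendedPrice;
        }
    }

    // WHERE o_orderstatus = 'F' AND YEAR(o_orderdate) > 1993 AND o_orderpriority LIKE '5%'
    static boolean matches(Order o) {
        return o.status.equals("F")
                && Integer.parseInt(o.date.substring(0, 4)) > 1993
                && o.priority.startsWith("5");
    }

    // Joins the filtered orders with the line items and sums extendedprice per (orderkey, shippriority).
    static Map<String, Double> revenue(List<Order> orders, List<LineItem> lineItems) {
        Map<Integer, Integer> shipPriorityByOrder = new HashMap<>();
        for (Order o : orders) {
            if (matches(o)) {
                shipPriorityByOrder.put(o.orderKey, o.shipPriority);
            }
        }
        Map<String, Double> revenue = new TreeMap<>();
        for (LineItem item : lineItems) {
            Integer shipPriority = shipPriorityByOrder.get(item.orderKey);
            if (shipPriority != null) {
                revenue.merge(item.orderKey + "|" + shipPriority, item.extendedPrice, Double::sum);
            }
        }
        return revenue;
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order(1, "F", "1995-01-02", "5-LOW", 0),
                new Order(2, "O", "1995-03-04", "5-LOW", 0),
                new Order(3, "F", "1993-05-06", "5-LOW", 0));
        List<LineItem> items = List.of(
                new LineItem(1, 10.0), new LineItem(1, 5.5),
                new LineItem(2, 7.0), new LineItem(3, 1.0));
        System.out.println(revenue(orders, items));
    }
}
```

Building the lookup from the filtered orders first mirrors the idea behind `joinWithHuge` in the Java tab: the smaller, filtered side is prepared once and the larger `lineitems` side is streamed past it.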
-
-The orders and lineitem files can be generated using the [TPC-H benchmark](http://www.tpc.org/tpch/) suite's data generator tool (DBGEN). 
-Take the following steps to generate arbitrarily large input files for the provided Flink programs:
-
-1.  Download and unpack DBGEN
-2.  Make a copy of *makefile.suite* called *Makefile* and perform the following changes:
-
-~~~bash
-DATABASE = DB2
-MACHINE  = LINUX
-WORKLOAD = TPCH
-CC       = gcc
-~~~
-
-3.  Build DBGEN using *make*
-4.  Generate lineitem and orders relations using dbgen. A scale factor
-    (-s) of 1 results in a generated data set with about 1 GB size.
-
-~~~bash
-./dbgen -T o -s 1
-~~~


http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/fault_tolerance.md
----------------------------------------------------------------------
diff --git a/docs/apis/fault_tolerance.md b/docs/apis/fault_tolerance.md
deleted file mode 100644
index 677ff95..0000000
--- a/docs/apis/fault_tolerance.md
+++ /dev/null
@@ -1,265 +0,0 @@
----
-title: "Fault Tolerance"
-is_beta: false
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-<a href="#top"></a>
-
-Flink's fault tolerance mechanism recovers programs in the presence of failures and
-continues to execute them. Such failures include machine hardware failures, network failures,
-transient program failures, etc.
-
-* This will be replaced by the TOC
-{:toc}
-
-
-Streaming Fault Tolerance (DataStream API)
-------------------------------------------
-
-Flink has a checkpointing mechanism that recovers streaming jobs after failures. The checkpointing mechanism requires a *persistent* (or *durable*) source that
-can be asked for prior records again (Apache Kafka is a good example of such a source).
-
-The checkpointing mechanism stores the progress in the data sources and data sinks, the state of windows, as well as the user-defined state (see [Working with State]({{ site.baseurl }}/apis/streaming_guide.html#working-with-state)) consistently to provide *exactly once* processing semantics. Where the checkpoints are stored (e.g., JobManager memory, file system, database) depends on the configured [state backend]({{ site.baseurl }}/apis/state_backends.html).
-
-The [docs on streaming fault tolerance]({{ site.baseurl }}/internals/stream_checkpointing.html) describe in detail the technique behind Flink's streaming fault tolerance mechanism.
-
-To enable checkpointing, call `enableCheckpointing(n)` on the `StreamExecutionEnvironment`, where *n* is the checkpoint interval in milliseconds.
-
-Other parameters for checkpointing include:
-
-- *Number of retries*: The `setNumberOfExecutionRetries()` method defines how many times the job is restarted after a failure.
-  When checkpointing is activated, but this value is not explicitly set, the job is restarted infinitely often.
-
-- *exactly-once vs. at-least-once*: You can optionally pass a mode to the `enableCheckpointing(n)` method to choose between the two guarantee levels.
-  Exactly-once is preferable for most applications. At-least-once may be relevant for certain super-low-latency (consistently few milliseconds) applications.
-
-- *number of concurrent checkpoints*: By default, the system will not trigger another checkpoint while one is still in progress. This ensures that the topology does not spend too much time on checkpoints and fail to make progress with processing the streams. It is possible to allow for multiple overlapping checkpoints, which is interesting for pipelines that have a certain processing delay (for example because the functions call external services that need some time to respond) but that still want to do very frequent checkpoints (100s of milliseconds) to re-process very little upon failures.
-
-- *checkpoint timeout*: The time after which a checkpoint-in-progress is aborted, if it has not completed by then.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
-
-// start a checkpoint every 1000 ms
-env.enableCheckpointing(1000);
-
-// advanced options:
-
-// set mode to exactly-once (this is the default)
-env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
-
-// checkpoints have to complete within one minute, or are discarded
-env.getCheckpointConfig().setCheckpointTimeout(60000);
-
-// allow only one checkpoint to be in progress at the same time
-env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val env = StreamExecutionEnvironment.getExecutionEnvironment
-
-// start a checkpoint every 1000 ms
-env.enableCheckpointing(1000)
-
-// advanced options:
-
-// set mode to exactly-once (this is the default)
-env.getCheckpointConfig.setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE)
-
-// checkpoints have to complete within one minute, or are discarded
-env.getCheckpointConfig.setCheckpointTimeout(60000)
-
-// allow only one checkpoint to be in progress at the same time
-env.getCheckpointConfig.setMaxConcurrentCheckpoints(1)
-{% endhighlight %}
-</div>
-</div>
-
-
-### Fault Tolerance Guarantees of Data Sources and Sinks
-
-Flink can guarantee exactly-once state updates to user-defined state only when the source participates in the 
-snapshotting mechanism. This is currently guaranteed for the Kafka source (and internal number generators), but
-not for other sources. The following table lists the state update guarantees of Flink coupled with the bundled sources:
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 25%">Source</th>
-      <th class="text-left" style="width: 25%">Guarantees</th>
-      <th class="text-left">Notes</th>
-    </tr>
-   </thead>
-   <tbody>
-        <tr>
-            <td>Apache Kafka</td>
-            <td>exactly once</td>
-            <td>Use the appropriate Kafka connector for your version</td>
-        </tr>
-        <tr>
-            <td>RabbitMQ</td>
-            <td>at most once (v 0.10) / exactly once (v 1.0) </td>
-            <td></td>
-        </tr>
-        <tr>
-            <td>Twitter Streaming API</td>
-            <td>at most once</td>
-            <td></td>
-        </tr>
-        <tr>
-            <td>Collections</td>
-            <td>exactly once</td>
-            <td></td>
-        </tr>
-        <tr>
-            <td>Files</td>
-            <td>at least once</td>
-            <td>At failure the file will be read from the beginning</td>
-        </tr>
-        <tr>
-            <td>Sockets</td>
-            <td>at most once</td>
-            <td></td>
-        </tr>
-  </tbody>
-</table>
-
-To guarantee end-to-end exactly-once record delivery (in addition to exactly-once state semantics), the data sink needs
-to take part in the checkpointing mechanism. The following table lists the delivery guarantees (assuming exactly-once 
-state updates) of Flink coupled with bundled sinks:
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 25%">Sink</th>
-      <th class="text-left" style="width: 25%">Guarantees</th>
-      <th class="text-left">Notes</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-        <td>HDFS rolling sink</td>
-        <td>exactly once</td>
-        <td>Implementation depends on Hadoop version</td>
-    </tr>
-    <tr>
-        <td>Elasticsearch</td>
-        <td>at least once</td>
-        <td></td>
-    </tr>
-    <tr>
-        <td>Kafka producer</td>
-        <td>at least once</td>
-        <td></td>
-    </tr>
-    <tr>
-        <td>File sinks</td>
-        <td>at least once</td>
-        <td></td>
-    </tr>
-    <tr>
-        <td>Socket sinks</td>
-        <td>at least once</td>
-        <td></td>
-    </tr>
-    <tr>
-        <td>Standard output</td>
-        <td>at least once</td>
-        <td></td>
-    </tr>
-  </tbody>
-</table>
-
-[Back to top](#top)
-
-
-Batch Processing Fault Tolerance (DataSet API)
-----------------------------------------------
-
-Fault tolerance for programs in the *DataSet API* works by retrying failed executions.
-The number of times that Flink retries the execution before the job is declared as failed is configurable
-via the *execution retries* parameter. A value of *0* effectively means that fault tolerance is deactivated.
-
-To activate the fault tolerance, set the *execution retries* to a value larger than zero. A common choice is a value
-of three.
-
-This example shows how to configure the execution retries for a Flink DataSet program.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
-env.setNumberOfExecutionRetries(3);
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val env = ExecutionEnvironment.getExecutionEnvironment
-env.setNumberOfExecutionRetries(3)
-{% endhighlight %}
-</div>
-</div>
-
-
-You can also define default values for the number of execution retries and the retry delay in the `flink-conf.yaml`:
-
-~~~
-execution-retries.default: 3
-~~~
-
-
-Retry Delays
-------------
-
-Execution retries can be configured to be delayed. Delaying the retry means that after a failed execution, the re-execution does not start
-immediately, but only after a certain delay.
-
-Delaying the retries can be helpful when the program interacts with external systems where, for example, connections or pending transactions should reach a timeout before re-execution is attempted.
-
-You can set the retry delay for each program as follows (the sample shows the DataStream API - the DataSet API works similarly):
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
-env.getConfig().setExecutionRetryDelay(5000); // 5000 milliseconds delay
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val env = StreamExecutionEnvironment.getExecutionEnvironment()
-env.getConfig.setExecutionRetryDelay(5000) // 5000 milliseconds delay
-{% endhighlight %}
-</div>
-</div>
-
-You can also define the default value for the retry delay in the 
`flink-conf.yaml`:
-
-~~~
-execution-retries.delay: 10 s
-~~~
-
-[Back to top](#top)
-
-

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/fig/LICENSE.txt
----------------------------------------------------------------------
diff --git a/docs/apis/fig/LICENSE.txt b/docs/apis/fig/LICENSE.txt
deleted file mode 100644
index 35b8673..0000000
--- a/docs/apis/fig/LICENSE.txt
+++ /dev/null
@@ -1,17 +0,0 @@
-All image files in the folder and its subfolders are
-licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/fig/iterations_delta_iterate_operator.png
----------------------------------------------------------------------
diff --git a/docs/apis/fig/iterations_delta_iterate_operator.png 
b/docs/apis/fig/iterations_delta_iterate_operator.png
deleted file mode 100644
index 470485a..0000000
Binary files a/docs/apis/fig/iterations_delta_iterate_operator.png and 
/dev/null differ

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/fig/iterations_delta_iterate_operator_example.png
----------------------------------------------------------------------
diff --git a/docs/apis/fig/iterations_delta_iterate_operator_example.png 
b/docs/apis/fig/iterations_delta_iterate_operator_example.png
deleted file mode 100644
index 15f2b54..0000000
Binary files a/docs/apis/fig/iterations_delta_iterate_operator_example.png and 
/dev/null differ

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/fig/iterations_iterate_operator.png
----------------------------------------------------------------------
diff --git a/docs/apis/fig/iterations_iterate_operator.png 
b/docs/apis/fig/iterations_iterate_operator.png
deleted file mode 100644
index aaf4158..0000000
Binary files a/docs/apis/fig/iterations_iterate_operator.png and /dev/null 
differ

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/fig/iterations_iterate_operator_example.png
----------------------------------------------------------------------
diff --git a/docs/apis/fig/iterations_iterate_operator_example.png 
b/docs/apis/fig/iterations_iterate_operator_example.png
deleted file mode 100644
index be4841c..0000000
Binary files a/docs/apis/fig/iterations_iterate_operator_example.png and 
/dev/null differ

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/fig/iterations_supersteps.png
----------------------------------------------------------------------
diff --git a/docs/apis/fig/iterations_supersteps.png 
b/docs/apis/fig/iterations_supersteps.png
deleted file mode 100644
index 331dbc7..0000000
Binary files a/docs/apis/fig/iterations_supersteps.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/fig/plan_visualizer.png
----------------------------------------------------------------------
diff --git a/docs/apis/fig/plan_visualizer.png 
b/docs/apis/fig/plan_visualizer.png
deleted file mode 100644
index 85b8c55..0000000
Binary files a/docs/apis/fig/plan_visualizer.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/fig/savepoints-overview.png
----------------------------------------------------------------------
diff --git a/docs/apis/fig/savepoints-overview.png 
b/docs/apis/fig/savepoints-overview.png
deleted file mode 100644
index c0e7563..0000000
Binary files a/docs/apis/fig/savepoints-overview.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/fig/savepoints-program_ids.png
----------------------------------------------------------------------
diff --git a/docs/apis/fig/savepoints-program_ids.png 
b/docs/apis/fig/savepoints-program_ids.png
deleted file mode 100644
index cc161ef..0000000
Binary files a/docs/apis/fig/savepoints-program_ids.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/filesystems.md
----------------------------------------------------------------------
diff --git a/docs/apis/filesystems.md b/docs/apis/filesystems.md
new file mode 100644
index 0000000..e100cdd
--- /dev/null
+++ b/docs/apis/filesystems.md
@@ -0,0 +1,236 @@
+---
+title: "File Systems"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 9
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Reading from file systems
+
+Flink has built-in support for the following file systems:
+
+| Filesystem                            | Scheme       | Notes  |
+| ------------------------------------- |--------------| ------ |
+| Hadoop Distributed File System (HDFS) &nbsp; | `hdfs://`    | All HDFS versions are supported |
+| Amazon S3                             | `s3://`      | Supported through Hadoop file system implementation (see below) |
+| MapR file system                      | `maprfs://`  | The user has to manually place the required jar files in the `lib/` dir |
+| Tachyon                               | `tachyon://` &nbsp; | Supported through Hadoop file system implementation (see below) |
+
+
+
+### Using Hadoop file system implementations
+
+Apache Flink allows users to use any file system implementing the 
`org.apache.hadoop.fs.FileSystem`
+interface. There are Hadoop `FileSystem` implementations for
+
+- [S3](https://aws.amazon.com/s3/) (tested)
+- [Google Cloud Storage Connector for 
Hadoop](https://cloud.google.com/hadoop/google-cloud-storage-connector) (tested)
+- [Tachyon](http://tachyon-project.org/) (tested)
+- [XtreemFS](http://www.xtreemfs.org/) (tested)
+- FTP via [Hftp](http://hadoop.apache.org/docs/r1.2.1/hftp.html) (not tested)
+- and many more.
+
+In order to use a Hadoop file system with Flink, make sure that
+
+- the `flink-conf.yaml` has the `fs.hdfs.hadoopconf` property set to the Hadoop configuration directory.
+- the Hadoop configuration (in that directory) has an entry for the required file system. Examples for S3 and Tachyon are shown below.
+- the required classes for using the file system are available in the `lib/` folder of the Flink installation (on all machines running Flink). If putting the files into the directory is not possible, Flink also respects the `HADOOP_CLASSPATH` environment variable to add Hadoop jar files to the classpath.
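+
+As a minimal sketch of the first point, the entry in `flink-conf.yaml` could look as follows. The directory `/etc/hadoop/conf` is only an assumed example location; use the directory that contains your `core-site.xml`:
+
+~~~
+# assumed example path to the Hadoop configuration directory
+fs.hdfs.hadoopconf: /etc/hadoop/conf
+~~~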
+
+#### Amazon S3
+
+For Amazon S3 support add the following entries into the `core-site.xml` file:
+
+~~~xml
+<!-- configure the file system implementation -->
+<property>
+  <name>fs.s3.impl</name>
+  <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
+</property>
+
+<!-- set your AWS access key ID -->
+<property>
+  <name>fs.s3.awsAccessKeyId</name>
+  <value>putKeyHere</value>
+</property>
+
+<!-- set your AWS secret access key -->
+<property>
+  <name>fs.s3.awsSecretAccessKey</name>
+  <value>putSecretHere</value>
+</property>
+~~~
+
+#### Tachyon
+
+For Tachyon support add the following entry into the `core-site.xml` file:
+
+~~~xml
+<property>
+  <name>fs.tachyon.impl</name>
+  <value>tachyon.hadoop.TFS</value>
+</property>
+~~~
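+
+Note that `<property>` entries in a Hadoop `core-site.xml` file are placed inside the top-level `<configuration>` element. As a sketch, a file that contains only the Tachyon entry from above would look like this:
+
+~~~xml
+<?xml version="1.0"?>
+<configuration>
+  <!-- register the Tachyon file system implementation -->
+  <property>
+    <name>fs.tachyon.impl</name>
+    <value>tachyon.hadoop.TFS</value>
+  </property>
+</configuration>
+~~~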
+
+
+## Connecting to other systems using Input/OutputFormat wrappers for Hadoop
+
+Apache Flink allows users to access many different systems as data sources or 
sinks.
+The system is designed for very easy extensibility. Similar to Apache Hadoop, Flink has the concept
+of so-called `InputFormat`s and `OutputFormat`s.
+
+One implementation of these `InputFormat`s is the `HadoopInputFormat`. This is 
a wrapper that allows
+users to use all existing Hadoop input formats with Flink.
+
+This section shows some examples for connecting Flink to other systems.
+[Read more about Hadoop compatibility in Flink](hadoop_compatibility.html).
+
+## Avro support in Flink
+
+Flink has extensive built-in support for [Apache Avro](http://avro.apache.org/). This makes it easy to read Avro files with Flink.
+Also, the serialization framework of Flink is able to handle classes generated from Avro schemas.
+
+In order to read data from an Avro file, you have to specify an 
`AvroInputFormat`.
+
+**Example**:
+
+~~~java
+AvroInputFormat<User> users = new AvroInputFormat<User>(in, User.class);
+DataSet<User> usersDS = env.createInput(users);
+~~~
+
+Note that `User` is a POJO generated by Avro. Flink also allows you to perform string-based key selection on these POJOs. For example:
+
+~~~java
+usersDS.groupBy("name")
+~~~
+
+
+Note that using the `GenericData.Record` type is possible with Flink, but not recommended. Since each record contains the full schema, it is very data-intensive and thus probably slow to use.
+
+Flink's POJO field selection also works with POJOs generated from Avro. However, this only works if the field types are written correctly to the generated class. If a field is of type `Object`, you cannot use the field as a join or grouping key.
+Specifying a field in Avro like this `{"name": "type_double_test", "type": "double"},` works fine. However, specifying it as a UNION type with only one field (`{"name": "type_double_test", "type": ["double"]},`) will generate a field of type `Object`. Note that specifying nullable types (`{"name": "type_double_test", "type": ["null", "double"]},`) is possible!
+
+
+
+### Access Microsoft Azure Table Storage
+
+_Note: This example works starting from Flink 0.6-incubating_
+
+This example uses the `HadoopInputFormat` wrapper with an existing Hadoop input format implementation to access [Azure's Table Storage](https://azure.microsoft.com/en-us/documentation/articles/storage-introduction/).
+
+1. Download and compile the `azure-tables-hadoop` project. The input format developed by the project is not yet available in Maven Central, so we have to build the project ourselves.
+Execute the following commands:
+
+   ~~~bash
+   git clone https://github.com/mooso/azure-tables-hadoop.git
+   cd azure-tables-hadoop
+   mvn clean install
+   ~~~
+
+2. Set up a new Flink project using the quickstarts:
+
+   ~~~bash
+   curl https://flink.apache.org/q/quickstart.sh | bash
+   ~~~
+
+3. Add the following dependencies (in the `<dependencies>` section) to your 
`pom.xml` file:
+
+   ~~~xml
+   <dependency>
+       <groupId>org.apache.flink</groupId>
+       <artifactId>flink-hadoop-compatibility</artifactId>
+       <version>{{site.version}}</version>
+   </dependency>
+   <dependency>
+     <groupId>com.microsoft.hadoop</groupId>
+     <artifactId>microsoft-hadoop-azure</artifactId>
+     <version>0.0.4</version>
+   </dependency>
+   ~~~
+
+   `flink-hadoop-compatibility` is a Flink package that provides the Hadoop input format wrappers.
+   `microsoft-hadoop-azure` adds the project we built before to our project.
+
+The project is now ready for coding. We recommend importing the project into an IDE, such as Eclipse or IntelliJ (import it as a Maven project!).
+Browse to the code of the `Job.java` file. It is an empty skeleton for a Flink job.
+
+Paste the following code into it:
+
+~~~java
+import java.util.Map;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.hadoopcompatibility.mapreduce.HadoopInputFormat;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapreduce.Job;
+import com.microsoft.hadoop.azure.AzureTableConfiguration;
+import com.microsoft.hadoop.azure.AzureTableInputFormat;
+import com.microsoft.hadoop.azure.WritableEntity;
+import com.microsoft.windowsazure.storage.table.EntityProperty;
+
+public class AzureTableExample {
+
+  public static void main(String[] args) throws Exception {
+    // set up the execution environment
+    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+
+    // create an AzureTableInputFormat, using a Hadoop input format wrapper
+    HadoopInputFormat<Text, WritableEntity> hdIf = new HadoopInputFormat<Text, WritableEntity>(new AzureTableInputFormat(), Text.class, WritableEntity.class, new Job());
+
+    // set the Account URI, something like: https://apacheflink.table.core.windows.net
+    hdIf.getConfiguration().set(AzureTableConfiguration.Keys.ACCOUNT_URI.getKey(), "TODO");
+    // set the secret storage key here
+    hdIf.getConfiguration().set(AzureTableConfiguration.Keys.STORAGE_KEY.getKey(), "TODO");
+    // set the table name here
+    hdIf.getConfiguration().set(AzureTableConfiguration.Keys.TABLE_NAME.getKey(), "TODO");
+
+    DataSet<Tuple2<Text, WritableEntity>> input = env.createInput(hdIf);
+    // a small example of how to use the data in a mapper.
+    DataSet<String> fin = input.map(new MapFunction<Tuple2<Text,WritableEntity>, String>() {
+      @Override
+      public String map(Tuple2<Text, WritableEntity> arg0) throws Exception {
+        System.err.println("--------------------------------\nKey = "+arg0.f0);
+        WritableEntity we = arg0.f1;
+
+        for(Map.Entry<String, EntityProperty> prop : we.getProperties().entrySet()) {
+          System.err.println("key="+prop.getKey() + " ; value (asString)="+prop.getValue().getValueAsString());
+        }
+
+        return arg0.f0.toString();
+      }
+    });
+
+    // emit result (this works only locally)
+    fin.print();
+
+    // execute program
+    env.execute("Azure Example");
+  }
+}
+~~~
+
+The example shows how to access an Azure table and turn its data into a Flink `DataSet` (more specifically, the type of the set is `DataSet<Tuple2<Text, WritableEntity>>`). You can then apply all known transformations to this `DataSet`.
+
+## Access MongoDB
+
+This [GitHub repository documents how to use MongoDB with Apache Flink 
(starting from 0.7-incubating)](https://github.com/okkam-it/flink-mongodb-test).

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/hadoop_compatibility.md
----------------------------------------------------------------------
diff --git a/docs/apis/hadoop_compatibility.md 
b/docs/apis/hadoop_compatibility.md
deleted file mode 100644
index aca1edf..0000000
--- a/docs/apis/hadoop_compatibility.md
+++ /dev/null
@@ -1,246 +0,0 @@
----
-title: "Hadoop Compatibility"
-is_beta: true
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-Flink is compatible with Apache Hadoop MapReduce interfaces and therefore 
allows
-reusing code that was implemented for Hadoop MapReduce.
-
-You can:
-
-- use Hadoop's `Writable` [data types](programming_guide.html#data-types) in 
Flink programs.
-- use any Hadoop `InputFormat` as a 
[DataSource](programming_guide.html#data-sources).
-- use any Hadoop `OutputFormat` as a 
[DataSink](programming_guide.html#data-sinks).
-- use a Hadoop `Mapper` as 
[FlatMapFunction](dataset_transformations.html#flatmap).
-- use a Hadoop `Reducer` as 
[GroupReduceFunction](dataset_transformations.html#groupreduce-on-grouped-dataset).
-
-This document shows how to use existing Hadoop MapReduce code with Flink. 
Please refer to the
-[Connecting to other systems](example_connectors.html) guide for reading from 
Hadoop supported file systems.
-
-* This will be replaced by the TOC
-{:toc}
-
-### Project Configuration
-
-Support for Hadoop input/output formats is part of the `flink-java` and
-`flink-scala` Maven modules that are always required when writing Flink jobs.
-The code is located in `org.apache.flink.api.java.hadoop` and
-`org.apache.flink.api.scala.hadoop` in an additional sub-package for the
-`mapred` and `mapreduce` API.
-
-Support for Hadoop Mappers and Reducers is contained in the 
`flink-hadoop-compatibility`
-Maven module.
-This code resides in the `org.apache.flink.hadoopcompatibility`
-package.
-
-Add the following dependency to your `pom.xml` if you want to reuse Mappers
-and Reducers.
-
-~~~xml
-<dependency>
-       <groupId>org.apache.flink</groupId>
-       <artifactId>flink-hadoop-compatibility</artifactId>
-       <version>{{site.version}}</version>
-</dependency>
-~~~
-
-### Using Hadoop Data Types
-
-Flink supports all Hadoop `Writable` and `WritableComparable` data types
-out-of-the-box. You do not need to include the Hadoop Compatibility dependency,
-if you only want to use your Hadoop data types. See the
-[Programming Guide](programming_guide.html#data-types) for more details.
-
-### Using Hadoop InputFormats
-
-Hadoop input formats can be used to create a data source by using
-one of the methods `readHadoopFile` or `createHadoopInput` of the
-`ExecutionEnvironment`. The former is used for input formats derived
-from `FileInputFormat` while the latter has to be used for general purpose
-input formats.
-
-The resulting `DataSet` contains 2-tuples where the first field
-is the key and the second field is the value retrieved from the Hadoop
-InputFormat.
-
-The following example shows how to use Hadoop's `TextInputFormat`.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-~~~java
-ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
-
-DataSet<Tuple2<LongWritable, Text>> input =
-    env.readHadoopFile(new TextInputFormat(), LongWritable.class, Text.class, 
textPath);
-
-// Do something with the data.
-[...]
-~~~
-
-</div>
-<div data-lang="scala" markdown="1">
-
-~~~scala
-val env = ExecutionEnvironment.getExecutionEnvironment
-               
-val input: DataSet[(LongWritable, Text)] =
-  env.readHadoopFile(new TextInputFormat, classOf[LongWritable], 
classOf[Text], textPath)
-
-// Do something with the data.
-[...]
-~~~
-
-</div>
-
-</div>
-
-### Using Hadoop OutputFormats
-
-Flink provides a compatibility wrapper for Hadoop `OutputFormats`. Any class
-that implements `org.apache.hadoop.mapred.OutputFormat` or extends
-`org.apache.hadoop.mapreduce.OutputFormat` is supported.
-The OutputFormat wrapper expects its input data to be a DataSet containing
-2-tuples of key and value. These are to be processed by the Hadoop 
OutputFormat.
-
-The following example shows how to use Hadoop's `TextOutputFormat`.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-~~~java
-// Obtain the result we want to emit
-DataSet<Tuple2<Text, IntWritable>> hadoopResult = [...]
-               
-// Set up the Hadoop TextOutputFormat.
-Job job = Job.getInstance();
-HadoopOutputFormat<Text, IntWritable> hadoopOF = 
-  // create the Flink wrapper.
-  new HadoopOutputFormat<Text, IntWritable>(
-    // set the Hadoop OutputFormat and specify the job.
-    new TextOutputFormat<Text, IntWritable>(), job
-  );
-hadoopOF.getConfiguration().set("mapreduce.output.textoutputformat.separator", " ");
-TextOutputFormat.setOutputPath(job, new Path(outputPath));
-               
-// Emit data using the Hadoop TextOutputFormat.
-hadoopResult.output(hadoopOF);
-~~~
-
-</div>
-<div data-lang="scala" markdown="1">
-
-~~~scala
-// Obtain your result to emit.
-val hadoopResult: DataSet[(Text, IntWritable)] = [...]
-
-val hadoopOF = new HadoopOutputFormat[Text,IntWritable](
-  new TextOutputFormat[Text, IntWritable],
-  new JobConf)
-
-hadoopOF.getJobConf.set("mapred.textoutputformat.separator", " ")
-FileOutputFormat.setOutputPath(hadoopOF.getJobConf, new Path(resultPath))
-
-hadoopResult.output(hadoopOF)
-
-               
-~~~
-
-</div>
-
-</div>
-
-### Using Hadoop Mappers and Reducers
-
-Hadoop Mappers are semantically equivalent to Flink's [FlatMapFunctions](dataset_transformations.html#flatmap) and Hadoop Reducers are equivalent to Flink's [GroupReduceFunctions](dataset_transformations.html#groupreduce-on-grouped-dataset). Flink provides wrappers for implementations of Hadoop MapReduce's `Mapper` and `Reducer` interfaces, i.e., you can reuse your Hadoop Mappers and Reducers in regular Flink programs. At the moment, only the Mapper and Reducer interfaces of Hadoop's mapred API (`org.apache.hadoop.mapred`) are supported.
-
-The wrappers take a `DataSet<Tuple2<KEYIN,VALUEIN>>` as input and produce a 
`DataSet<Tuple2<KEYOUT,VALUEOUT>>` as output where `KEYIN` and `KEYOUT` are the 
keys and `VALUEIN` and `VALUEOUT` are the values of the Hadoop key-value pairs 
that are processed by the Hadoop functions. For Reducers, Flink offers a 
wrapper for a GroupReduceFunction with (`HadoopReduceCombineFunction`) and 
without a Combiner (`HadoopReduceFunction`). The wrappers accept an optional 
`JobConf` object to configure the Hadoop Mapper or Reducer.
-
-Flink's function wrappers are 
-
-- `org.apache.flink.hadoopcompatibility.mapred.HadoopMapFunction`,
-- `org.apache.flink.hadoopcompatibility.mapred.HadoopReduceFunction`, and
-- `org.apache.flink.hadoopcompatibility.mapred.HadoopReduceCombineFunction`.
-
-and can be used as regular Flink 
[FlatMapFunctions](dataset_transformations.html#flatmap) or 
[GroupReduceFunctions](dataset_transformations.html#groupreduce-on-grouped-dataset).
-
-The following example shows how to use Hadoop `Mapper` and `Reducer` functions.
-
-~~~java
-// Obtain data to process somehow.
-DataSet<Tuple2<LongWritable, Text>> text = [...]
-
-DataSet<Tuple2<Text, LongWritable>> result = text
-  // use Hadoop Mapper (Tokenizer) as MapFunction
-  .flatMap(new HadoopMapFunction<LongWritable, Text, Text, LongWritable>(
-    new Tokenizer()
-  ))
-  .groupBy(0)
-  // use Hadoop Reducer (Counter) as Reduce- and CombineFunction
-  .reduceGroup(new HadoopReduceCombineFunction<Text, LongWritable, Text, LongWritable>(
-    new Counter(), new Counter()
-  ));
-~~~
-
-**Please note:** The Reducer wrapper works on groups as defined by Flink's 
[groupBy()](dataset_transformations.html#transformations-on-grouped-dataset) 
operation. It does not consider any custom partitioners, sort or grouping 
comparators you might have set in the `JobConf`. 
-
-### Complete Hadoop WordCount Example
-
-The following example shows a complete WordCount implementation using Hadoop 
data types, Input- and OutputFormats, and Mapper and Reducer implementations.
-
-~~~java
-ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
-               
-// Set up the Hadoop TextInputFormat.
-Job job = Job.getInstance();
-HadoopInputFormat<LongWritable, Text> hadoopIF = 
-  new HadoopInputFormat<LongWritable, Text>(
-    new TextInputFormat(), LongWritable.class, Text.class, job
-  );
-TextInputFormat.addInputPath(job, new Path(inputPath));
-               
-// Read data using the Hadoop TextInputFormat.
-DataSet<Tuple2<LongWritable, Text>> text = env.createInput(hadoopIF);
-
-DataSet<Tuple2<Text, LongWritable>> result = text
-  // use Hadoop Mapper (Tokenizer) as MapFunction
-  .flatMap(new HadoopMapFunction<LongWritable, Text, Text, LongWritable>(
-    new Tokenizer()
-  ))
-  .groupBy(0)
-  // use Hadoop Reducer (Counter) as Reduce- and CombineFunction
-  .reduceGroup(new HadoopReduceCombineFunction<Text, LongWritable, Text, LongWritable>(
-    new Counter(), new Counter()
-  ));
-
-// Set up the Hadoop TextOutputFormat.
-HadoopOutputFormat<Text, LongWritable> hadoopOF = 
-  new HadoopOutputFormat<Text, LongWritable>(
-    new TextOutputFormat<Text, LongWritable>(), job
-  );
-hadoopOF.getConfiguration().set("mapreduce.output.textoutputformat.separator", " ");
-TextOutputFormat.setOutputPath(job, new Path(outputPath));
-               
-// Emit data using the Hadoop TextOutputFormat.
-result.output(hadoopOF);
-
-// Execute Program
-env.execute("Hadoop WordCount");
-~~~

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/index.md
----------------------------------------------------------------------
diff --git a/docs/apis/index.md b/docs/apis/index.md
index db82e6f..ab12b79 100644
--- a/docs/apis/index.md
+++ b/docs/apis/index.md
@@ -18,4 +18,4 @@ software distributed under the License is distributed on an
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
--->
\ No newline at end of file
+-->

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/iterations.md
----------------------------------------------------------------------
diff --git a/docs/apis/iterations.md b/docs/apis/iterations.md
deleted file mode 100644
index 54d1b24..0000000
--- a/docs/apis/iterations.md
+++ /dev/null
@@ -1,209 +0,0 @@
----
-title:  "Iterations"
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-Iterative algorithms occur in many domains of data analysis, such as *machine learning* or *graph analysis*. Such algorithms are crucial in order to realize the promise of Big Data to extract meaningful information out of your data. With increasing interest in running these kinds of algorithms on very large data sets, there is a need to execute iterations in a massively parallel fashion.
-
-Flink programs implement iterative algorithms by defining a **step function** 
and embedding it into a special iteration operator. There are two  variants of 
this operator: **Iterate** and **Delta Iterate**. Both operators repeatedly 
invoke the step function on the current iteration state until a certain 
termination condition is reached.
-
-Here, we provide background on both operator variants and outline their usage. 
The [programming guide](programming_guide.html) explains how to implement the 
operators in both Scala and Java. We also support both **vertex-centric and 
gather-sum-apply iterations** through Flink's graph processing API, 
[Gelly]({{site.baseurl}}/libs/gelly_guide.html).
-
-The following table provides an overview of both operators:
-
-
-<table class="table table-striped table-hover table-bordered">
-       <thead>
-               <th></th>
-               <th class="text-center">Iterate</th>
-               <th class="text-center">Delta Iterate</th>
-       </thead>
-       <tr>
-               <td class="text-center" width="20%"><strong>Iteration 
Input</strong></td>
-               <td class="text-center" width="40%"><strong>Partial 
Solution</strong></td>
-               <td class="text-center" width="40%"><strong>Workset</strong> 
and <strong>Solution Set</strong></td>
-       </tr>
-       <tr>
-               <td class="text-center"><strong>Step Function</strong></td>
-               <td colspan="2" class="text-center">Arbitrary Data Flows</td>
-       </tr>
-       <tr>
-               <td class="text-center"><strong>State Update</strong></td>
-               <td class="text-center">Next <strong>partial 
solution</strong></td>
-               <td>
-                       <ul>
-                               <li>Next workset</li>
-                               <li><strong>Changes to solution 
set</strong></li>
-                       </ul>
-               </td>
-       </tr>
-       <tr>
-               <td class="text-center"><strong>Iteration Result</strong></td>
-               <td class="text-center">Last partial solution</td>
-               <td class="text-center">Solution set state after last 
iteration</td>
-       </tr>
-       <tr>
-               <td class="text-center"><strong>Termination</strong></td>
-               <td>
-                       <ul>
-                               <li><strong>Maximum number of 
iterations</strong> (default)</li>
-                               <li>Custom aggregator convergence</li>
-                       </ul>
-               </td>
-               <td>
-                       <ul>
-                               <li><strong>Maximum number of iterations or 
empty workset</strong> (default)</li>
-                               <li>Custom aggregator convergence</li>
-                       </ul>
-               </td>
-       </tr>
-</table>
-
-
-* This will be replaced by the TOC
-{:toc}
-
-Iterate Operator
-----------------
-
-The **iterate operator** covers the *simple form of iterations*: in each iteration, the **step function** consumes the **entire input** (the *result of the previous iteration*, or the *initial data set*), and computes the **next version of the partial solution** (e.g. using `map`, `reduce`, `join`, etc.).
-
-<p class="text-center">
-    <img alt="Iterate Operator" width="60%" 
src="fig/iterations_iterate_operator.png" />
-</p>
-
-  1. **Iteration Input**: Initial input for the *first iteration* from a *data 
source* or *previous operators*.
-  2. **Step Function**: The step function will be executed in each iteration. 
It is an arbitrary data flow consisting of operators like `map`, `reduce`, 
`join`, etc. and depends on your specific task at hand.
-  3. **Next Partial Solution**: In each iteration, the output of the step 
function will be fed back into the *next iteration*.
-  4. **Iteration Result**: Output of the *last iteration* is written to a 
*data sink* or used as input to the *following operators*.
-
-There are multiple options to specify **termination conditions** for an 
iteration:
-
-  - **Maximum number of iterations**: Without any further conditions, the 
iteration will be executed this many times.
-  - **Custom aggregator convergence**: Iterations allow you to specify *custom 
aggregators* and *convergence criteria*, for example summing the number of 
emitted records (aggregator) and terminating if this sum is zero (convergence 
criterion).
-
-You can also think about the iterate operator in pseudo-code:
-
-~~~java
-IterationState state = getInitialState();
-
-while (!terminationCriterion()) {
-       state = step(state);
-}
-
-setFinalState(state);
-~~~
-
-<div class="panel panel-default">
-       <div class="panel-body">
-       See the <strong><a href="programming_guide.html">Programming Guide</a> 
</strong> for details and code examples.</div>
-</div>
-
-### Example: Incrementing Numbers
-
-In the following example, we **iteratively increment a set of numbers**:
-
-<p class="text-center">
-    <img alt="Iterate Operator Example" width="60%" 
src="fig/iterations_iterate_operator_example.png" />
-</p>
-
-  1. **Iteration Input**: The initial input is read from a data source and 
consists of five single-field records (integers `1` to `5`).
-  2. **Step function**: The step function is a single `map` operator, which 
increments the integer field from `i` to `i+1`. It will be applied to every 
record of the input.
-  3. **Next Partial Solution**: The output of the step function will be the 
output of the map operator, i.e. records with incremented integers.
-  4. **Iteration Result**: After ten iterations, the initial numbers will have 
been incremented ten times, resulting in integers `11` to `15`.
-
-~~~
-// 1st           2nd                       10th
-map(1) -> 2      map(2) -> 3      ...      map(10) -> 11
-map(2) -> 3      map(3) -> 4      ...      map(11) -> 12
-map(3) -> 4      map(4) -> 5      ...      map(12) -> 13
-map(4) -> 5      map(5) -> 6      ...      map(13) -> 14
-map(5) -> 6      map(6) -> 7      ...      map(14) -> 15
-~~~
-
-Note that **1**, **2**, and **4** can be arbitrary data flows.
-
-
-Delta Iterate Operator
-----------------------
-
-The **delta iterate operator** covers the case of **incremental iterations**. 
Incremental iterations **selectively modify elements** of their **solution** 
and evolve the solution rather than fully recompute it.
-
-Where applicable, this leads to **more efficient algorithms**, because not 
every element in the solution set changes in each iteration. This makes it possible to 
**focus on the hot parts** of the solution and leave the **cold parts 
untouched**. Frequently, the majority of the solution cools down comparatively 
fast and the later iterations operate only on a small subset of the data.
-
-<p class="text-center">
-    <img alt="Delta Iterate Operator" width="60%" 
src="fig/iterations_delta_iterate_operator.png" />
-</p>
-
-  1. **Iteration Input**: The initial workset and solution set are read from 
*data sources* or *previous operators* as input to the first iteration.
-  2. **Step Function**: The step function will be executed in each iteration. 
It is an arbitrary data flow consisting of operators like `map`, `reduce`, 
`join`, etc. and depends on your specific task at hand.
-  3. **Next Workset/Update Solution Set**: The *next workset* drives the 
iterative computation and will be fed back into the *next iteration*. 
Furthermore, the solution set will be updated and implicitly forwarded (it 
does not need to be rebuilt). Both data sets can be updated by different 
operators of the step function.
-  4. **Iteration Result**: After the *last iteration*, the *solution set* is 
written to a *data sink* or used as input to the *following operators*.
-
-The default **termination condition** for delta iterations is specified by the 
**empty workset convergence criterion** and a **maximum number of iterations**. 
The iteration will terminate when a produced *next workset* is empty or when 
the maximum number of iterations is reached. It is also possible to specify a 
**custom aggregator** and **convergence criterion**.
-
-You can also think about the delta iterate operator in pseudo-code:
-
-~~~java
-IterationState workset = getInitialState();
-IterationState solution = getInitialSolution();
-
-while (!terminationCriterion()) {
-       (delta, workset) = step(workset, solution);
-
-       solution.update(delta);
-}
-
-setFinalState(solution);
-~~~
-
-<div class="panel panel-default">
-       <div class="panel-body">
-       See the <strong><a href="programming_guide.html">programming 
guide</a></strong> for details and code examples.</div>
-</div>
-
-### Example: Propagate Minimum in Graph
-
-In the following example, every vertex has an **ID** and a **coloring**. Each 
vertex will propagate its vertex ID to neighboring vertices. The **goal** is to 
*assign the minimum ID to every vertex in a subgraph*. If a received ID is 
smaller than the current one, the vertex changes to the color of the vertex with the 
received ID. One application of this can be found in *community analysis* or 
*connected components* computation.
-
-<p class="text-center">
-    <img alt="Delta Iterate Operator Example" width="100%" 
src="fig/iterations_delta_iterate_operator_example.png" />
-</p>
-
-The **initial input** is set as **both workset and solution set.** In the above 
figure, the colors visualize the **evolution of the solution set**. With each 
iteration, the color of the minimum ID is spreading in the respective subgraph. 
At the same time, the amount of work (exchanged and compared vertex IDs) 
decreases with each iteration. This corresponds to the **decreasing size of the 
workset**, which goes from all seven vertices to zero after three iterations, 
at which time the iteration terminates. The **important observation** is that 
*the lower subgraph converges before the upper half* does and the delta 
iteration is able to capture this with the workset abstraction.
-
-In the upper subgraph **ID 1** (*orange*) is the **minimum ID**. In the 
**first iteration**, it will get propagated to vertex 2, which will 
subsequently change its color to orange. Vertices 3 and 4 will receive **ID 2** 
(in *yellow*) as their current minimum ID and change to yellow. Because the 
color of *vertex 1* didn't change in the first iteration, it can be skipped 
in the next workset.
-
-In the lower subgraph **ID 5** (*cyan*) is the **minimum ID**. All vertices of 
the lower subgraph will receive it in the first iteration. Again, we can skip 
the unchanged vertices (*vertex 5*) for the next workset.
-
-In the **2nd iteration**, the workset size has already decreased from seven to 
five elements (vertices 2, 3, 4, 6, and 7). These are part of the iteration and 
further propagate their current minimum IDs. After this iteration, the lower 
subgraph has already converged (**cold part** of the graph), as it has no 
elements in the workset, whereas the upper half needs a further iteration 
(**hot part** of the graph) for the two remaining workset elements (vertices 3 
and 4).
-
-The iteration **terminates** when the workset is empty after the **3rd 
iteration**.
-
-<a href="#supersteps"></a>
-
-Superstep Synchronization
--------------------------
-
-We referred to each execution of the step function of an iteration operator as 
*a single iteration*. In parallel setups, **multiple instances of the step 
function are evaluated in parallel** on different partitions of the iteration 
state. In many settings, one evaluation of the step function on all parallel 
instances forms a so-called **superstep**, which is also the granularity of 
synchronization. Therefore, *all* parallel tasks of an iteration need to 
complete the superstep before the next superstep is initialized. 
**Termination criteria** will also be evaluated at superstep barriers.
-
-<p class="text-center">
-    <img alt="Supersteps" width="50%" src="fig/iterations_supersteps.png" />
-</p>
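
The "Propagate Minimum in Graph" walkthrough in the removed iterations page can be traced with a small plain-Java sketch. It does not use the Flink API; it only mimics the workset and solution set semantics of the pseudo-code. The class and method names are hypothetical, and the edge list (1-2, 2-3, 2-4, 5-6, 5-7) is an assumption inferred from the prose, since the figure itself is not part of this diff.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class DeltaIterationSketch {

    /** Builds an undirected adjacency list from pairs of vertex IDs. */
    static Map<Integer, List<Integer>> undirected(int[][] edges) {
        Map<Integer, List<Integer>> graph = new HashMap<>();
        for (int[] e : edges) {
            graph.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            graph.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
        }
        return graph;
    }

    /**
     * Propagates the minimum ID through each subgraph. The solution map
     * (vertex -> current minimum ID) is updated in place; the return value
     * lists the workset size at the start of every iteration.
     */
    static List<Integer> propagateMinimum(Map<Integer, List<Integer>> neighbors,
                                          Map<Integer, Integer> solution,
                                          int maxIterations) {
        // The initial input serves as both workset and solution set.
        Set<Integer> workset = new TreeSet<>(solution.keySet());
        List<Integer> worksetSizes = new ArrayList<>();

        for (int i = 0; i < maxIterations && !workset.isEmpty(); i++) {
            worksetSizes.add(workset.size());

            // Step function: every workset vertex sends its current ID to its
            // neighbors; the delta keeps only strict improvements.
            Map<Integer, Integer> delta = new TreeMap<>();
            for (int vertex : workset) {
                int id = solution.get(vertex);
                for (int n : neighbors.getOrDefault(vertex, Collections.<Integer>emptyList())) {
                    int best = delta.getOrDefault(n, solution.get(n));
                    if (id < best) {
                        delta.put(n, id);
                    }
                }
            }

            // Update the solution set; changed vertices form the next workset.
            solution.putAll(delta);
            workset = new TreeSet<>(delta.keySet());
        }
        return worksetSizes;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> solution = new TreeMap<>();
        for (int v = 1; v <= 7; v++) {
            solution.put(v, v);
        }
        int[][] edges = {{1, 2}, {2, 3}, {2, 4}, {5, 6}, {5, 7}};
        System.out.println(propagateMinimum(undirected(edges), solution, 10));
        System.out.println(solution);
    }
}
```

Under the assumed edges, the workset shrinks from seven to five to two elements and is empty after the third iteration, which matches the description in the text.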

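The superstep synchronization described in the removed iterations page can be approximated in plain Java with a `java.util.concurrent.CyclicBarrier`. This is a sketch of the idea only, not how Flink implements it: the class name, the partitioning scheme, and the use of threads as "parallel instances" are assumptions made for illustration. The step function is the increment from the "Incrementing Numbers" example, and the termination criterion (a maximum number of supersteps, assumed to be at least 1) is evaluated at the barrier.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class SuperstepSketch {

    /** Increments every element once per superstep, using parallel workers. */
    static int[] run(int[] state, int parallelism, int maxSupersteps) {
        AtomicInteger completed = new AtomicInteger();
        AtomicBoolean terminated = new AtomicBoolean(false);

        // The barrier action runs once per superstep, after all workers have
        // arrived and before any of them is released: this is where the
        // termination criterion is evaluated.
        CyclicBarrier barrier = new CyclicBarrier(parallelism, () -> {
            if (completed.incrementAndGet() >= maxSupersteps) {
                terminated.set(true);
            }
        });

        Thread[] workers = new Thread[parallelism];
        for (int p = 0; p < parallelism; p++) {
            final int partition = p;
            workers[p] = new Thread(() -> {
                try {
                    while (!terminated.get()) {
                        // Step function on this worker's partition of the state.
                        for (int i = partition; i < state.length; i += parallelism) {
                            state[i] = state[i] + 1;
                        }
                        // Wait until all parallel instances finish the superstep.
                        barrier.await();
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[p].start();
        }

        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return state;
    }

    public static void main(String[] args) {
        // Prints [11, 12, 13, 14, 15]
        System.out.println(Arrays.toString(run(new int[] {1, 2, 3, 4, 5}, 2, 10)));
    }
}
```

Because the termination flag only changes inside the barrier action, all workers observe the same decision and leave the loop after the same superstep.
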
http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/java8.md
----------------------------------------------------------------------
diff --git a/docs/apis/java8.md b/docs/apis/java8.md
index 6866b95..53269e3 100644
--- a/docs/apis/java8.md
+++ b/docs/apis/java8.md
@@ -1,5 +1,9 @@
 ---
 title: "Java 8 Programming Guide"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 11
+top-nav-title: Java 8
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -20,8 +24,8 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-Java 8 introduces several new language features designed for faster and 
clearer coding. With the most important feature, 
-the so-called "Lambda Expressions", Java 8 opens the door to functional 
programming. Lambda Expressions allow for implementing and 
+Java 8 introduces several new language features designed for faster and 
clearer coding. With the most important feature,
+the so-called "Lambda Expressions", Java 8 opens the door to functional 
programming. Lambda Expressions allow for implementing and
 passing functions in a straightforward way without having to declare 
additional (anonymous) classes.
 
 The newest version of Flink supports the usage of Lambda Expressions for all 
operators of the Java API.
@@ -33,7 +37,7 @@ Flink API, please refer to the [Programming 
Guide](programming_guide.html)
 
 ### Examples
 
-The following example illustrates how to implement a simple, inline `map()` 
function that squares its input using a Lambda Expression. 
+The following example illustrates how to implement a simple, inline `map()` 
function that squares its input using a Lambda Expression.
 The types of input `i` and output parameters of the `map()` function need not 
be declared as they are inferred by the Java 8 compiler.
 
 ~~~java
@@ -43,9 +47,9 @@ env.fromElements(1, 2, 3)
 .print();
 ~~~
 
-The next two examples show different implementations of a function that uses a 
`Collector` for output. 
-Functions, such as `flatMap()`, require an output type (in this case `String`) 
to be defined for the `Collector` in order to be type-safe. 
-If the `Collector` type cannot be inferred from the surrounding context, it 
needs to be declared in the Lambda Expression's parameter list manually. 
+The next two examples show different implementations of a function that uses a 
`Collector` for output.
+Functions, such as `flatMap()`, require an output type (in this case `String`) 
to be defined for the `Collector` in order to be type-safe.
+If the `Collector` type cannot be inferred from the surrounding context, it 
needs to be declared in the Lambda Expression's parameter list manually.
 Otherwise the output will be treated as type `Object` which can lead to 
undesired behaviour.
 
 ~~~java
@@ -65,7 +69,7 @@ input.flatMap((Integer number, Collector<String> out) -> {
 DataSet<Integer> input = env.fromElements(1, 2, 3);
 
 // the collector type need not be declared; it is inferred from the type of the 
dataset
-DataSet<String> manyALetters = input.flatMap((number, out) -> {        
+DataSet<String> manyALetters = input.flatMap((number, out) -> {
     for(int i = 0; i < number; i++) {
         out.collect("a");
     }
@@ -79,13 +83,13 @@ The following code demonstrates a word count which makes 
extensive use of Lambda
 
 ~~~java
 DataSet<String> input = env.fromElements("Please count", "the words", "but not 
this");
-               
+
 // filter out strings that contain "not"
 input.filter(line -> !line.contains("not"))
 // split each line by space
 .map(line -> line.split(" "))
 // emit a pair <word,1> for each array element
-.flatMap((String[] wordArray, Collector<Tuple2<String, Integer>> out) 
+.flatMap((String[] wordArray, Collector<Tuple2<String, Integer>> out)
     -> Arrays.stream(wordArray).forEach(t -> out.collect(new Tuple2<>(t, 1)))
     )
 // group and sum up
@@ -95,12 +99,12 @@ input.filter(line -> !line.contains("not"))
 ~~~
 
 ### Compiler Limitations
-Currently, Flink only supports jobs containing Lambda Expressions completely 
if they are **compiled with the Eclipse JDT compiler contained in Eclipse Luna 
4.4.2 (and above)**. 
+Currently, Flink only supports jobs containing Lambda Expressions completely 
if they are **compiled with the Eclipse JDT compiler contained in Eclipse Luna 
4.4.2 (and above)**.
 
-Only the Eclipse JDT compiler preserves the generic type information necessary 
to use the entire Lambda Expressions feature type-safely. 
-Other compilers such as the OpenJDK's and Oracle JDK's `javac` throw away all 
generic parameters related to Lambda Expressions. This means that types such as 
`Tuple2<String, Integer>` or `Collector<String>` declared as a Lambda function 
input or output parameter will be pruned to `Tuple2` or `Collector` in the 
compiled `.class` files, which is too little information for the Flink 
Compiler. 
+Only the Eclipse JDT compiler preserves the generic type information necessary 
to use the entire Lambda Expressions feature type-safely.
+Other compilers such as the OpenJDK's and Oracle JDK's `javac` throw away all 
generic parameters related to Lambda Expressions. This means that types such as 
`Tuple2<String, Integer>` or `Collector<String>` declared as a Lambda function 
input or output parameter will be pruned to `Tuple2` or `Collector` in the 
compiled `.class` files, which is too little information for the Flink Compiler.
 
-How to compile a Flink job that contains Lambda Expressions with the JDT 
compiler will be covered in the next section. 
+How to compile a Flink job that contains Lambda Expressions with the JDT 
compiler will be covered in the next section.
 
 However, it is possible to implement functions such as `map()` or `filter()` 
with Lambda Expressions in Java 8 compilers other than the Eclipse JDT compiler 
as long as the function has no `Collector`s or `Iterable`s *and* only if the 
function handles unparameterized types such as `Integer`, `Long`, `String`, 
`MyOwnClass` (types without Generics!).
 
@@ -108,7 +112,7 @@ However, it is possible to implement functions such as 
`map()` or `filter()` wit
 
 If you are using the Eclipse IDE, you can run and debug your Flink code within 
the IDE without any problems after some configuration steps. The Eclipse IDE by 
default compiles its Java sources with the Eclipse JDT compiler. The next 
section describes how to configure the Eclipse IDE.
 
-If you are using a different IDE such as IntelliJ IDEA or you want to package 
your Jar-File with Maven to run your job on a cluster, you need to modify your 
project's `pom.xml` file and build your program with Maven. The 
[quickstart]({{site.baseurl}}/quickstart/setup_quickstart.html) contains 
preconfigured Maven projects which can be used for new projects or as a 
reference. Uncomment the mentioned lines in your generated quickstart `pom.xml` 
file if you want to use Java 8 with Lambda Expressions. 
+If you are using a different IDE such as IntelliJ IDEA or you want to package 
your Jar-File with Maven to run your job on a cluster, you need to modify your 
project's `pom.xml` file and build your program with Maven. The 
[quickstart]({{site.baseurl}}/quickstart/setup_quickstart.html) contains 
preconfigured Maven projects which can be used for new projects or as a 
reference. Uncomment the mentioned lines in your generated quickstart `pom.xml` 
file if you want to use Java 8 with Lambda Expressions.
 
 Alternatively, you can manually insert the following lines to your Maven 
`pom.xml` file. Maven will then use the Eclipse JDT compiler for compilation.
 
@@ -146,7 +150,7 @@ If you are using Eclipse for development, the m2e plugin 
might complain about th
         <versionRange>[3.1,)</versionRange>
         <goals>
             <goal>testCompile</goal>
-            <goal>compile</goal> 
+            <goal>compile</goal>
         </goals>
     </pluginExecutionFilter>
     <action>
@@ -159,7 +163,7 @@ If you are using Eclipse for development, the m2e plugin 
might complain about th
 
 First of all, make sure you are running a current version of Eclipse IDE 
(4.4.2 or later). Also make sure that you have a Java 8 Runtime Environment 
(JRE) installed in Eclipse IDE (`Window` -> `Preferences` -> `Java` -> 
`Installed JREs`).
 
-Create/Import your Eclipse project. 
+Create/Import your Eclipse project.
 
 If you are using Maven, you also need to change the Java version in your 
`pom.xml` for the `maven-compiler-plugin`. Otherwise right click the `JRE 
System Library` section of your project and open the `Properties` window in 
order to switch to a Java 8 JRE (or above) that supports Lambda Expressions.
 
@@ -177,7 +181,7 @@ org.eclipse.jdt.core.compiler.compliance=1.8
 org.eclipse.jdt.core.compiler.source=1.8
 ~~~
 
-After you have saved the file, perform a complete project refresh in Eclipse 
IDE. 
+After you have saved the file, perform a complete project refresh in Eclipse 
IDE.
 
 If you are using Maven, right click your Eclipse project and select `Maven` -> 
`Update Project...`.
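
The "Compiler Limitations" section of `java8.md` explains that generic type information around Lambda Expressions can be lost. A related effect can be observed with plain Java reflection. This is a sketch, not Flink's type extraction: it only compares what `Class.getGenericInterfaces()` reports for an anonymous class and for a lambda, and it does not distinguish between `javac` and the Eclipse JDT compiler. The class and method names are hypothetical.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.function.Function;

class LambdaTypeInfoSketch {

    /** Returns the first interface of the object's class as seen by reflection. */
    static Type firstGenericInterface(Object function) {
        return function.getClass().getGenericInterfaces()[0];
    }

    /** Anonymous class: the type arguments are recorded in the class file. */
    static Function<Integer, String> asAnonymousClass() {
        return new Function<Integer, String>() {
            @Override
            public String apply(Integer i) {
                return "a" + i;
            }
        };
    }

    /** Lambda: the runtime class only exposes the raw functional interface. */
    static Function<Integer, String> asLambda() {
        return i -> "a" + i;
    }

    public static void main(String[] args) {
        Type anonymous = firstGenericInterface(asAnonymousClass());
        Type lambda = firstGenericInterface(asLambda());

        System.out.println("anonymous class keeps type arguments: "
                + (anonymous instanceof ParameterizedType));
        System.out.println("lambda keeps type arguments: "
                + (lambda instanceof ParameterizedType));
    }
}
```

Both functions behave identically when called; the difference lies only in how much type information a framework can recover by inspecting them.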
 

http://git-wip-us.apache.org/repos/asf/flink/blob/ad267a4b/docs/apis/local_execution.md
----------------------------------------------------------------------
diff --git a/docs/apis/local_execution.md b/docs/apis/local_execution.md
index dacd114..93d8860 100644
--- a/docs/apis/local_execution.md
+++ b/docs/apis/local_execution.md
@@ -1,5 +1,8 @@
 ---
 title:  "Local Execution"
+# Top-level navigation
+top-nav-group: apis
+top-nav-pos: 7
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
@@ -99,7 +102,7 @@ Users can use algorithms implemented for batch processing 
also for cases that ar
 public static void main(String[] args) throws Exception {
     // initialize a new Collection-based execution environment
     final ExecutionEnvironment env = new CollectionEnvironment();
-    
+
     DataSet<User> users = env.fromCollection( /* get elements from a Java 
Collection */);
 
     /* Data Set transformations ... */
@@ -107,10 +110,10 @@ public static void main(String[] args) throws Exception {
     // retrieve the resulting Tuple2 elements into an ArrayList.
     Collection<...> result = new ArrayList<...>();
     resultDataSet.output(new LocalCollectionOutputFormat<...>(result));
-    
+
     // kick off execution.
     env.execute();
-    
+
     // Do some work with the resulting ArrayList (=Collection).
     for(... t : result) {
         System.err.println("Result = "+t);
