[readme] update to reflect the current state
Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/70ae13c7
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/70ae13c7
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/70ae13c7

Branch: refs/heads/master
Commit: 70ae13c7497907cd7ba81481dc7eafff1615adfb
Parents: 8434c3c
Author: Max <[email protected]>
Authored: Thu Feb 11 12:36:02 2016 +0100
Committer: Davor Bonaci <[email protected]>
Committed: Fri Mar 4 10:04:23 2016 -0800

----------------------------------------------------------------------
 runners/flink/README.md | 82 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 67 insertions(+), 15 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/70ae13c7/runners/flink/README.md
----------------------------------------------------------------------
diff --git a/runners/flink/README.md b/runners/flink/README.md
index 54d248c..499ed6d 100644
--- a/runners/flink/README.md
+++ b/runners/flink/README.md
@@ -1,13 +1,72 @@
 Flink-Dataflow
 --------------
 
-Flink-Dataflow is a Google Dataflow Runner for Apache Flink. It enables you to
-run Dataflow programs with Flink as an execution engine.
+Flink-Dataflow is a Runner for Google Dataflow (aka Apache Beam) which enables you to
+run Dataflow programs with Flink. It integrates seamlessly with the Dataflow
+API, allowing you to execute Dataflow programs in streaming or batch mode.
+
+## Streaming
+
+### Full Dataflow Windowing and Triggering Semantics
+
+The Flink Dataflow Runner supports *Event Time*, allowing you to analyze data with respect to its
+associated timestamp. It handles out-of-order and late-arriving elements. You may leverage the full
+power of the Dataflow windowing semantics, including *time-based*, *sliding*, *tumbling*, and *count*
+windows. You may also build *session* windows, which allow you to keep track of events associated with
+each other.
+
+### Fault-Tolerance
+
+The program's state is persisted by Apache Flink. You may re-run and resume your program upon
+failure or if you decide to continue computation at a later time.
+
+### Sources and Sinks
+
+Write your own data ingestion or output code using the source/sink interface, re-use Flink's sources
+and sinks, or use the provided support for Apache Kafka.
+
+### Seamless Integration
+
+To execute a Dataflow program in streaming mode, just enable streaming in the `PipelineOptions`:
+
+    options.setStreaming(true);
+
+That's it. If you prefer batch execution, simply disable streaming mode.
+
+## Batch
+
+### Batch Optimization
+
+Flink gives you out-of-core algorithms which operate on its managed memory to perform sorting,
+caching, and hash table operations. We have optimized operations like CoGroup to use Flink's
+optimized out-of-core implementation.
+
+### Fault-Tolerance
+
+We guarantee job-level fault-tolerance which gracefully restarts failed batch jobs.
+
+### Sources and Sinks
+
+Write your own data ingestion or output code using the source/sink interface, or re-use Flink's
+sources and sinks.
+
+## Features
+
+The Flink Dataflow Runner maintains as much compatibility with the Dataflow API as possible. We
+support transformations on data like:
+
+- Grouping
+- Windowing
+- ParDo
+- CoGroup
+- Flatten
+- Combine
+- Side inputs/outputs
+- Encoding
 
 # Getting Started
 
-To get started using Google Dataflow on top of Apache Flink, we need to install the
-latest version of Flink-Dataflow.
+To get started using Flink-Dataflow, we first need to install the latest version.
 
 ## Install Flink-Dataflow ##
 
@@ -46,7 +105,6 @@ p.apply(TextIO.Read.named("ReadLines").from(options.getInput()))
 
 p.run();
 ```
-
 To execute the example, let's first get some sample data:
 
     curl http://www.gutenberg.org/cache/epub/1128/pg1128.txt > kinglear.txt
@@ -58,7 +116,7 @@ Then let's run the included WordCount locally on your machine:
 
 Congratulations, you have run your first Google Dataflow program on top of Apache Flink!
 
-# Running Dataflow on Flink on a cluster
+# Running Dataflow programs on a Flink cluster
 
 You can run your Dataflow program on an Apache Flink cluster. Please start off by creating a new
 Maven project.
@@ -137,14 +195,8 @@ folder to the Flink cluster using the command-line utility like so:
 
     ./bin/flink run /path/to/fat.jar
 
-For more information, please visit the [Apache Flink Website](http://flink.apache.org) or contact
-the [Mailinglists](http://flink.apache.org/community.html#mailing-lists).
-
-# Streaming
-Streaming support has been added. It is currently in alpha stage. Please give it a try. To use
-streaming, just enable streaming mode in the `PipelineOptions`:
+# More
 
-    options.setStreaming(true);
-
-That's all.
\ No newline at end of file
+For more information, please visit the [Apache Flink website](http://flink.apache.org) or contact
+the [mailing lists](http://flink.apache.org/community.html#mailing-lists).
\ No newline at end of file
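The streaming toggle the README describes (`options.setStreaming(true);`) can be sketched end-to-end. This is an illustrative sketch only, assuming the Google Cloud Dataflow Java SDK of that era and the Flink runner are on the classpath; import paths and the `StreamingOptions` interface are assumptions beyond what the diff itself shows:

```java
// Sketch: a minimal pipeline that switches between streaming and batch
// execution via PipelineOptions, as described in the updated README.
// Assumes the Dataflow Java SDK (1.x-era API) and Flink-Dataflow are on
// the classpath; class names not shown in the README are assumptions.
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.TextIO;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.options.StreamingOptions;

public class StreamingToggle {
  public static void main(String[] args) {
    StreamingOptions options =
        PipelineOptionsFactory.fromArgs(args).as(StreamingOptions.class);
    options.setStreaming(true); // set to false for batch execution

    Pipeline p = Pipeline.create(options);
    // Same read as the WordCount example in the README, with a
    // hypothetical input path:
    p.apply(TextIO.Read.named("ReadLines").from("/path/to/input.txt"));
    p.run();
  }
}
```

The same program runs in either mode; only the `setStreaming` flag changes, which is the "seamless integration" point the README makes.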
