Author: buildbot
Date: Fri Apr 8 19:10:17 2016
New Revision: 985124
Log:
Staging update by buildbot for mahout
Modified:
websites/staging/mahout/trunk/content/ (props changed)
websites/staging/mahout/trunk/content/users/flinkbindings/flink-internals.html
Propchange: websites/staging/mahout/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Apr 8 19:10:17 2016
@@ -1 +1 @@
-1738283
+1738288
Modified:
websites/staging/mahout/trunk/content/users/flinkbindings/flink-internals.html
==============================================================================
---
websites/staging/mahout/trunk/content/users/flinkbindings/flink-internals.html
(original)
+++
websites/staging/mahout/trunk/content/users/flinkbindings/flink-internals.html
Fri Apr 8 19:10:17 2016
@@ -280,6 +280,17 @@ h2:hover > .headerlink, h3:hover > .head
<p>Apache Flink is a distributed big data streaming engine that supports both
Streaming and Batch interfaces. Batch processing is an extension of Flink's
Stream processing engine.</p>
<p>The Mahout Flink integration presently supports Flink's batch processing
capabilities leveraging the DataSet API.</p>
<p>The Mahout DRM, or Distributed Row Matrix, is an abstraction for storing a
large matrix of numbers in-memory in a cluster by distributing logical rows
among servers. Mahout's Scala DSL provides an abstract API on DRMs that backend
engines implement; the Spark backend engine is one example. Each engine has its
own design for mapping the abstract API onto its
data model and provides implementations for algebraic operators over that
mapping.</p>
+<h1 id="flink-overview">Flink Overview<a class="headerlink"
href="#flink-overview" title="Permanent link">¶</a></h1>
+<p>Apache Flink is an open source, distributed Stream and Batch Processing
Framework. At its core, Flink is a Stream Processing engine, and Batch
processing is an extension of Stream Processing.</p>
+<p>Flink includes several APIs for building applications with the Flink
Engine:</p>
+<p><ol>
+<li><b>DataSet API</b> for Batch data in Java, Scala and Python</li>
+<li><b>DataStream API</b> for Stream Processing in Java and Scala</li>
+<li><b>Table API</b> with a SQL-like expression language in Java and
Scala</li>
+<li><b>Gelly</b> Graph Processing API in Java and Scala</li>
+<li><b>CEP API</b>, a complex event processing library</li>
+<li><b>FlinkML</b>, a Machine Learning library</li>
+</ol></p>
<h1 id="flink-environment-engine">Flink Environment Engine<a
class="headerlink" href="#flink-environment-engine" title="Permanent
link">¶</a></h1>
<p>The Flink backend implements the abstract DRM as a Flink DataSet. A Flink
job runs in the context of an ExecutionEnvironment (from the Flink Batch
processing API).</p>
<h1 id="source-layout">Source Layout<a class="headerlink"
href="#source-layout" title="Permanent link">¶</a></h1>