
takidau Sat, 19 Mar 2016 14:10:03 -0700

Capability matrix page + blog post:
- Content as discussed in this thread:
    
http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201603.mbox/%3CCAB8MnHW6mE3GXvemDStCvn_1zMxqXj0ZWLJBgO9hNuHed9ue%2Bw%40mail.gmail.com%3E
  and as iterated upon by relevant committers in this doc:
    
https://docs.google.com/spreadsheets/d/1OM077lZBARrtUi6g0X0O0PHaIbFKCD6v0djRefQRE1I/edit
- Enumerates current capabilities per-runner in _data/capability-matrix.yml.
  This file should be kept up to date over time as runners evolve, new
  runners are added, etc.
- Creates new page with live capability matrix, which will be updated over
  time via changes to the aforementioned YAML file.
- Creates new blog post with summary snapshot of the current matrix.
- Updates authors support in blog post templates to handle multiple authors.



Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/931d7f51
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/931d7f51
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/931d7f51

Branch: refs/heads/asf-site
Commit: 931d7f51002341201bb3534db81da46e54c97716
Parents: a8ebbad
Author: Tyler Akidau <[email protected]>
Authored: Thu Mar 17 14:40:19 2016 -0700
Committer: Tyler Akidau <[email protected]>
Committed: Thu Mar 17 14:52:09 2016 -0700

----------------------------------------------------------------------
 _data/authors.yml                               |   8 +
 _data/capability-matrix.yml                     | 561 +++++++++++++
 _includes/authors-list.md                       |   1 +
 _includes/capability-matrix-common.md           |   7 +
 _includes/capability-matrix-row-blog.md         |   1 +
 _includes/capability-matrix-row-full.md         |   1 +
 _includes/capability-matrix-row-summary.md      |   1 +
 _includes/capability-matrix.md                  |  28 +
 _includes/header.html                           |  13 +-
 _layouts/post.html                              |   4 +-
 _pages/blog.md                                  |   5 +-
 _pages/capability-matrix.md                     |  41 +
 _posts/2016-02-22-beam-has-a-logo.markdown      |   3 +-
 _posts/2016-02-22-beam-has-a-logo0.markdown     |   3 +-
 _posts/2016-03-17-compatability-matrix.md       | 596 ++++++++++++++
 _sass/capability-matrix.scss                    | 127 +++
 .../python/sdk/2016/02/25/beam-has-a-logo0.html |   8 +-
 .../website/2016/02/22/beam-has-a-logo.html     |  10 +-
 content/blog/index.html                         | 795 +++++++++++++++++-
 content/feed.xml                                | 807 ++++++++++++++++++-
 content/getting_started/index.html              |   1 +
 content/index.html                              |   3 +
 content/issue_tracking/index.html               |   1 +
 content/mailing_lists/index.html                |   1 +
 content/privacy_policy/index.html               |   1 +
 content/source_repository/index.html            |  10 +-
 content/styles/site.css                         | 107 +++
 content/team/index.html                         |   1 +
 styles/site.scss                                |   1 +
 29 files changed, 3118 insertions(+), 28 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_data/authors.yml
----------------------------------------------------------------------
diff --git a/_data/authors.yml b/_data/authors.yml
index f76b24f..b71ad8a 100644
--- a/_data/authors.yml
+++ b/_data/authors.yml
@@ -1,4 +1,12 @@
+fjp:
+    name: Frances Perry
+    email: [email protected]
+    twitter: francesjperry
 jamesmalone:
     name: James Malone
     email: [email protected]
     twitter: chimerasaurus
+takidau:
+    name: Tyler Akidau
+    email: [email protected]
+    twitter: takidau

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_data/capability-matrix.yml
----------------------------------------------------------------------
diff --git a/_data/capability-matrix.yml b/_data/capability-matrix.yml
new file mode 100644
index 0000000..785854a
--- /dev/null
+++ b/_data/capability-matrix.yml
@@ -0,0 +1,561 @@
+columns:
+  - class: model
+    name: Beam Model
+  - class: dataflow
+    name: Google Cloud Dataflow
+  - class: flink
+    name: Apache Flink
+  - class: spark
+    name: Apache Spark
+
+categories:
+  - description: What is being computed?
+    anchor: what
+    color-b: 'ca1'
+    color-y: 'ec3'
+    color-p: 'fe5'
+    color-n: 'ddd'
+    rows:
+      - name: ParDo
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: element-wise processing
+            l3: Element-wise transformation parameterized by a chunk of user 
code. Elements are processed in bundles, with initialization and termination 
hooks. Bundle size is chosen by the runner and cannot be controlled by user 
code. ParDo processes a main input PCollection one element at a time, but 
provides side input access to additional PCollections.
+          - class: dataflow
+            l1: 'Yes'
+            l2: fully supported
+            l3: Batch mode uses large bundle sizes. Streaming uses smaller 
bundle sizes.
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: ParDo itself, as per-element transformation with UDFs, is 
fully supported by Flink for both batch and streaming.
+          - class: spark
+            l1: 'Yes'
+            l2: fully supported
+            l3: ParDo applies per-element transformations as Spark 
FlatMapFunction.
+      - name: GroupByKey
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: key grouping
+            l3: Grouping of key-value pairs per key, window, and pane. (See 
also other tabs.)
+          - class: dataflow
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: "Uses Flink's keyBy for key grouping. When grouping by window 
in streaming (creating the panes) the Flink runner uses the Beam code. This 
guarantees support for all windowing and triggering mechanisms."
+          - class: spark
+            l1: 'Partially'
+            l2: group by window in batch only
+            l3: "Uses Spark's groupByKey for grouping. Grouping by window is 
currently only supported in batch."
+      - name: Flatten
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: collection concatenation
+            l3: Concatenates multiple homogeneously typed collections together.
+          - class: dataflow
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
+          - class: spark
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
+
+      - name: Combine
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: associative &amp; commutative aggregation
+            l3: 'Application of an associative, commutative operation over all 
values ("globally") or over all values associated with each key ("per key"). 
Can be implemented using ParDo, but often more efficient implementations exist.'
+          - class: dataflow
+            l1: 'Yes'
+            l2: 'efficient execution'
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: 'fully supported'
+            l3: Uses a combiner for pre-aggregation for batch and streaming.
+          - class: spark
+            l1: 'Yes'
+            l2: fully supported
+            l3: Supports GroupedValues, Globally and PerKey.
+
+      - name: Composite Transforms
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: user-defined transformation subgraphs
+            l3: Allows easy extensibility for library writers.  In the near 
future, we expect there to be more information provided at this level -- 
customized metadata hooks for monitoring, additional runtime/environment hooks, 
etc.
+          - class: dataflow
+            l1: 'Partially'
+            l2: supported via inlining
+            l3: Currently composite transformations are inlined during 
execution. The structure is later recreated from the names, but other transform 
level information (if added to the model) will be lost.
+          - class: flink
+            l1: 'Partially'
+            l2: supported via inlining
+            l3: ''
+          - class: spark
+            l1: 'Partially'
+            l2: supported via inlining
+            l3: ''
+
+      - name: Side Inputs
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: additional elements available during DoFn execution
+            l3: Side inputs are additional <tt>PCollections</tt> whose 
contents are computed during pipeline execution and then made accessible to 
DoFn code. The exact shape of the side input depends both on the 
<tt>PCollectionView</tt> used to describe the access pattern (iterable, map, 
singleton) and the window of the element from the main input that is currently 
being processed.
+          - class: dataflow
+            l1: 'Yes'
+            l2: some size restrictions in streaming
+            l3: Batch mode supports a distributed implementation, but 
streaming mode may force some size restrictions. Neither mode is able to push 
lookups directly up into key-based sources.
+          - class: flink
+            jira: BEAM-102
+            l1: 'Partially'
+            l2: not supported in streaming
+            l3: Supported in batch. Side inputs for streaming are currently 
WiP.
+          - class: spark
+            l1: 'Partially'
+            l2: not supported in streaming
+            l3: "Side input is actually a broadcast variable in Spark so it 
can't be updated during the life of a job. Spark-runner implementation of side 
input is more of an immutable, static, side input."
+
+      - name: Source API
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: user-defined sources
+            l3: Allows users to provide additional input sources. Supports 
both bounded and unbounded data. Includes hooks necessary to provide efficient 
parallelization (size estimation, progress information, dynamic splitting, etc).
+          - class: dataflow
+            l1: 'Yes'
+            l2: fully supported
+            l3: 
+          - class: flink
+            jira: BEAM-103
+            l1: 'Partially'
+            l2: parallelism 1 in streaming
+            l3: Fully supported in batch. In streaming, sources currently run 
with parallelism 1.
+          - class: spark
+            l1: 'Yes'
+            l2: fully supported
+            l3: 
+
+      - name: Aggregators
+        values:
+          - class: model
+            l1: 'Partially'
+            l2: user-provided metrics
+            l3: Allow transforms to aggregate simple metrics across bundles in 
a <tt>DoFn</tt>. Semantically equivalent to using a side output, but support 
partial results as the transform executes. Will likely want to augment 
<tt>Aggregators</tt> to be more useful for processing unbounded data by making 
them windowed.
+          - class: dataflow
+            l1: 'Partially'
+            l2: may miscount in streaming mode
+            l3: Current model is fully supported in batch mode. In streaming 
mode, <tt>Aggregators</tt> may under- or overcount when bundles are retried.
+          - class: flink
+            l1: 'Partially'
+            l2: may undercount in streaming
+            l3: Current model is fully supported in batch. In streaming mode, 
<tt>Aggregators</tt> may undercount.
+          - class: spark
+            l1: 'Partially'
+            l2: streaming requires more testing
+            l3: "Uses Spark's <tt>AccumulatorParam</tt> mechanism"
+
+      - name: Keyed State
+        values:
+          - class: model
+            jira: BEAM-25
+            l1: 'No'
+            l2: storage per key, per window
+            l3: Allows fine-grained access to per-key, per-window persistent 
state. Necessary for certain use cases (e.g. high-volume windows which store 
large amounts of data, but typically only access small portions of it; complex 
state machines; etc.) that are not easily or efficiently addressed via 
<tt>Combine</tt> or <tt>GroupByKey</tt>+<tt>ParDo</tt>. 
+          - class: dataflow
+            l1: 'No'
+            l2: pending model support
+            l3: Dataflow already supports keyed state internally, so adding 
support for this should be easy once the Beam model exposes it.
+          - class: flink
+            l1: 'No'
+            l2: pending model support
+            l3: Flink already supports keyed state, so adding support for this 
should be easy once the Beam model exposes it.
+          - class: spark
+            l1: 'No'
+            l2: pending model support
+            l3: Spark supports keyed state with mapWithState() so support 
should be straightforward.
+
+
+  - description: Where in event time?
+    anchor: where
+    color-b: '37d'
+    color-y: '59f'
+    color-p: '8cf'
+    color-n: 'ddd'
+    rows:
+      - name: Global windows
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: all time
+            l3: The default window which covers all of time. (Basically how 
traditional batch cases fit in the model.)
+          - class: dataflow
+            l1: 'Yes'
+            l2: default
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: supported
+            l3: ''
+          - class: spark
+            l1: 'Yes'
+            l2: supported
+            l3: ''
+
+      - name: Fixed windows
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: periodic, non-overlapping
+            l3: Fixed-size, timestamp-based windows. (Hourly, Daily, etc)
+          - class: dataflow
+            l1: 'Yes'
+            l2: built-in
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: supported
+            l3: ''
+          - class: spark
+            l1: Partially
+            l2: currently only supported in batch
+            l3: ''
+
+      - name: Sliding windows
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: periodic, overlapping
+            l3: Possibly overlapping fixed-size timestamp-based windows (Every 
minute, use the last ten minutes of data.)
+          - class: dataflow
+            l1: 'Yes'
+            l2: built-in
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: supported
+            l3: ''
+          - class: spark
+            l1: 'No'
+            l2: ''
+            l3: ''
+
+      - name: Session windows
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: activity-based
+            l3: Based on bursts of activity separated by a gap size. Different 
per key.
+          - class: dataflow
+            l1: 'Yes'
+            l2: built-in
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'No'
+            l2: pending Spark engine support
+            l3: ''
+
+      - name: Custom windows
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: user-defined windows
+            l3: All windows must implement <tt>BoundedWindow</tt>, which 
specifies a max timestamp. Each <tt>WindowFn</tt> assigns elements to an 
associated window.
+          - class: dataflow
+            l1: 'Yes'
+            l2: supported
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'No'
+            l2: pending Spark engine support
+            l3: ''
+
+      - name: Custom merging windows
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: user-defined merging windows
+            l3: A custom <tt>WindowFn</tt> additionally specifies whether and 
how to merge windows.
+          - class: dataflow
+            l1: 'Yes'
+            l2: supported
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'No'
+            l2: pending Spark engine support
+            l3: ''
+
+      - name: Timestamp control
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: output timestamp for window panes
+            l3: For a grouping transform, such as GBK or Combine, an 
OutputTimeFn specifies (1) how to combine input timestamps within a window and 
(2) how to merge aggregated timestamps when windows merge.
+          - class: dataflow
+            l1: 'Yes'
+            l2: supported
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'No'
+            l2: pending Spark engine support
+            l3: ''
+
+
+
+  - description: When in processing time?
+    anchor: when
+    color-b: '6a4'
+    color-y: '8c6'
+    color-p: 'ae8'
+    color-n: 'ddd'
+    rows:
+
+      - name: Configurable triggering
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: user customizable
+            l3: Triggering may be specified by the user (instead of simply 
driven by hardcoded defaults).
+          - class: dataflow
+            l1: 'Yes'
+            l2: fully supported
+            l3: Fully supported in streaming mode. In batch mode, intermediate 
trigger firings are effectively meaningless.
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'No'
+            l2: ''
+            l3: ''
+
+      - name: Event-time triggers
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: relative to event time
+            l3: Triggers that fire in response to event-time completeness 
signals, such as watermarks progressing.
+          - class: dataflow
+            l1: 'Yes'
+            l2: yes in streaming, fixed granularity in batch
+            l3: Fully supported in streaming mode. In batch mode, currently 
watermark progress jumps from the beginning of time to the end of time once the 
input has been fully consumed, thus no additional triggering granularity is 
available.
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'No'
+            l2: ''
+            l3: ''
+
+      - name: Processing-time triggers
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: relative to processing time
+            l3: Triggers that fire in response to processing-time advancing.
+          - class: dataflow
+            l1: 'Yes'
+            l2: yes in streaming, fixed granularity in batch
+            l3: Fully supported in streaming mode. In batch mode, from the 
perspective of triggers, processing time currently jumps from the beginning of 
time to the end of time once the input has been fully consumed, thus no 
additional triggering granularity is available.
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'Yes'
+            l2: "This is Spark streaming's native model"
+            l3: "Spark processes streams in micro-batches. The micro-batch 
size is actually a pre-set, fixed, time interval. Currently, the runner takes 
the first window size in the pipeline and sets its size as the batch interval. 
Any following window operations will be considered processing time windows and 
will affect triggering."
+
+      - name: Count triggers
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: every N elements
+            l3: Triggers that fire after seeing at least N elements.
+          - class: dataflow
+            l1: 'Yes'
+            l2: fully supported
+            l3: Fully supported in streaming mode. In batch mode, elements are 
processed in the largest bundles possible, so count-based triggers are 
effectively meaningless.
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'No'
+            l2: ''
+            l3: ''
+
+      - name: '[Meta]data driven triggers'
+        values:
+          - class: model
+            jira: BEAM-101
+            l1: 'No'
+            l2: in response to data
+            l3: Triggers that fire in response to attributes of the data being 
processed.
+          - class: dataflow
+            l1: 'No'
+            l2: pending model support
+            l3: 
+          - class: flink
+            l1: 'No'
+            l2: pending model support
+            l3: 
+          - class: spark
+            l1: 'No'
+            l2: pending model support
+            l3: 
+
+      - name: Composite triggers
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: compositions of one or more sub-triggers
+            l3: Triggers which compose other triggers in more complex 
structures, such as logical AND, logical OR, early/on-time/late, etc.
+          - class: dataflow
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'No'
+            l2: ''
+            l3: ''
+
+      - name: Allowed lateness
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: event-time bound on window lifetimes
+            l3: A way to bound the useful lifetime of a window (in event 
time), after which any unemitted results may be materialized, the window 
contents may be garbage collected, and any additional late data that arrive for 
the window may be discarded.
+          - class: dataflow
+            l1: 'Yes'
+            l2: fully supported
+            l3: Fully supported in streaming mode. In batch mode no data is 
ever late.
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'No'
+            l2: ''
+            l3: ''
+
+      - name: Timers
+        values:
+          - class: model
+            jira: BEAM-27
+            l1: 'No'
+            l2: delayed processing callbacks
+            l3: A fine-grained mechanism for performing work at some point in 
the future, in either the event-time or processing-time domain. Useful for 
orchestrating delayed events, timeouts, etc. in complex per-key, 
per-window state machines.
+          - class: dataflow
+            l1: 'No'
+            l2: pending model support
+            l3: Dataflow already supports timers internally, so adding support 
for this should be easy once the Beam model exposes it.
+          - class: flink
+            l1: 'No'
+            l2: pending model support
+            l3: Flink already supports timers internally, so adding support 
for this should be easy once the Beam model exposes it.
+          - class: spark
+            l1: 'No'
+            l2: pending model support
+            l3: ''
+
+
+  - description: How do refinements relate?
+    anchor: how
+    color-b: 'b55'
+    color-y: 'd77'
+    color-p: 'faa'
+    color-n: 'ddd'
+    rows:
+
+      - name: Discarding
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: panes discard elements when fired
+            l3: Elements are discarded from accumulated state as their pane is 
fired.
+          - class: dataflow
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'Yes'
+            l2: fully supported
+            l3: 'Spark streaming natively discards elements after firing.'
+
+      - name: Accumulating
+        values:
+          - class: model
+            l1: 'Yes'
+            l2: panes accumulate elements across firings
+            l3: Elements are accumulated in state across multiple pane firings 
for the same window.
+          - class: dataflow
+            l1: 'Yes'
+            l2: fully supported
+            l3: Requires that the accumulated pane fits in memory, after being 
passed through the combiner (if relevant)
+          - class: flink
+            l1: 'Yes'
+            l2: fully supported
+            l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+          - class: spark
+            l1: 'No'
+            l2: ''
+            l3: ''
+
+      - name: 'Accumulating &amp; Retracting'
+        values:
+          - class: model
+            jira: BEAM-91
+            l1: 'No'
+            l2: accumulation plus retraction of old panes
+            l3: Elements are accumulated across multiple pane firings and old 
emitted values are retracted. Also known as "backsies" ;-D
+          - class: dataflow
+            l1: 'No'
+            l2: pending model support
+            l3: ''
+          - class: flink
+            l1: 'No'
+            l2: pending model support
+            l3: ''
+          - class: spark
+            l1: 'No'
+            l2: pending model support
+            l3: ''
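
The YAML above nests categories, then rows, then one value per column class, with `l1` holding the support level. As a rough illustration of how that structure can be walked, the following sketch tallies `l1` values per runner. It assumes a hand-copied two-row excerpt expressed as a Python dict instead of parsing the real file, and `tally_support` is a hypothetical helper name, not something in the commit.

```python
from collections import Counter

# Hand-copied excerpt of _data/capability-matrix.yml as a Python dict.
# Only the keys used below are kept (l2, l3 and the colours are omitted).
MATRIX = {
    "columns": [
        {"class": "model", "name": "Beam Model"},
        {"class": "dataflow", "name": "Google Cloud Dataflow"},
        {"class": "flink", "name": "Apache Flink"},
        {"class": "spark", "name": "Apache Spark"},
    ],
    "categories": [
        {
            "anchor": "what",
            "rows": [
                {"name": "ParDo", "values": [
                    {"class": "model", "l1": "Yes"},
                    {"class": "dataflow", "l1": "Yes"},
                    {"class": "flink", "l1": "Yes"},
                    {"class": "spark", "l1": "Yes"},
                ]},
                {"name": "GroupByKey", "values": [
                    {"class": "model", "l1": "Yes"},
                    {"class": "dataflow", "l1": "Yes"},
                    {"class": "flink", "l1": "Yes"},
                    {"class": "spark", "l1": "Partially"},
                ]},
            ],
        },
    ],
}


def tally_support(matrix):
    """Count l1 support levels per column class (hypothetical helper)."""
    tally = {col["class"]: Counter() for col in matrix["columns"]}
    for category in matrix["categories"]:
        for row in category["rows"]:
            for val in row["values"]:
                tally[val["class"]][val["l1"]] += 1
    return tally


if __name__ == "__main__":
    for runner, counts in tally_support(MATRIX).items():
        print(runner, dict(counts))
```

Running the same walk over the full file would need a YAML parser; that step is left out here to keep the sketch dependency-free.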

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_includes/authors-list.md
----------------------------------------------------------------------
diff --git a/_includes/authors-list.md b/_includes/authors-list.md
new file mode 100644
index 0000000..1207445
--- /dev/null
+++ b/_includes/authors-list.md
@@ -0,0 +1 @@
+{% assign count = authors | size %}{% for name in authors %}{% if 
forloop.first == false and count > 2 %},{% endif %}{% if forloop.last and count 
> 1 %} &amp;{% endif %}{% assign author = site.data.authors[name] %} {{ 
author.name }} [<a href="https://twitter.com/{{ author.twitter }}">@{{ 
author.twitter }}</a>]{% endfor %}
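
The Liquid one-liner above packs its separator rules tightly: a comma before every author but the first when there are more than two, and an ampersand before the last when there is more than one. A plain-text Python sketch of the same rules may make them easier to check. `format_authors` is a hypothetical name, the author entries are copied from `_data/authors.yml` in this commit, and the HTML link is reduced to the bare `@handle`.

```python
# Author entries copied from _data/authors.yml (email fields omitted).
AUTHORS = {
    "fjp": {"name": "Frances Perry", "twitter": "francesjperry"},
    "jamesmalone": {"name": "James Malone", "twitter": "chimerasaurus"},
    "takidau": {"name": "Tyler Akidau", "twitter": "takidau"},
}


def format_authors(keys, authors):
    """Mirror the separator logic of _includes/authors-list.md as plain text."""
    count = len(keys)
    parts = []
    for index, key in enumerate(keys):
        is_first = index == 0
        is_last = index == count - 1
        # Comma before every author except the first, only for 3+ authors.
        if not is_first and count > 2:
            parts.append(",")
        # Ampersand before the last author when there is more than one.
        if is_last and count > 1:
            parts.append(" &")
        author = authors[key]
        parts.append(f" {author['name']} [@{author['twitter']}]")
    # The template emits a leading space; strip it for plain-text use.
    return "".join(parts).strip()


if __name__ == "__main__":
    print(format_authors(["fjp", "takidau"], AUTHORS))
    print(format_authors(["fjp", "jamesmalone", "takidau"], AUTHORS))
```

With three or more authors the rules produce a comma before the ampersand, which follows directly from the two conditions being evaluated independently.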

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_includes/capability-matrix-common.md
----------------------------------------------------------------------
diff --git a/_includes/capability-matrix-common.md 
b/_includes/capability-matrix-common.md
new file mode 100644
index 0000000..78b20e9
--- /dev/null
+++ b/_includes/capability-matrix-common.md
@@ -0,0 +1,7 @@
+<script type="text/javascript">
+  function ToggleTables(showDetails, anchor) {
+    document.getElementById("cap-summary").style.display = showDetails ? 
"none" : "block";
+    document.getElementById("cap-full").style.display = showDetails ? "block" 
: "none";
+    location.hash = anchor;
+  }
+</script>

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_includes/capability-matrix-row-blog.md
----------------------------------------------------------------------
diff --git a/_includes/capability-matrix-row-blog.md 
b/_includes/capability-matrix-row-blog.md
new file mode 100644
index 0000000..bd3da68
--- /dev/null
+++ b/_includes/capability-matrix-row-blog.md
@@ -0,0 +1 @@
+<b><center>{% if val.l1 == 'Yes' %}&#x2713;{% elsif val.l1 == 'Partially' 
%}~{% else %}&#x2715;{% endif %}</center></b>

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_includes/capability-matrix-row-full.md
----------------------------------------------------------------------
diff --git a/_includes/capability-matrix-row-full.md 
b/_includes/capability-matrix-row-full.md
new file mode 100644
index 0000000..3734a98
--- /dev/null
+++ b/_includes/capability-matrix-row-full.md
@@ -0,0 +1 @@
+<b><center>{{ val.l1 }}{% if val.l2 != '' %}: {{ val.l2 }}{% endif %}{% if 
val.jira %}<br>(<a href='https://issues.apache.org/jira/browse/{{ val.jira 
}}'>{{ val.jira }}</a>){% endif %}</center></b><br>{{ val.l3 }}

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_includes/capability-matrix-row-summary.md
----------------------------------------------------------------------
diff --git a/_includes/capability-matrix-row-summary.md 
b/_includes/capability-matrix-row-summary.md
new file mode 100644
index 0000000..922802a
--- /dev/null
+++ b/_includes/capability-matrix-row-summary.md
@@ -0,0 +1 @@
+<b><center>{% if val.l1 == 'Yes' %}&#x2713;{% elsif val.l1 == 'Partially' 
%}~{% else %}&#x2715;{% endif %}{% if val.jira %} (<a 
href='https://issues.apache.org/jira/browse/{{ val.jira }}'>{{ val.jira 
}}</a>){% endif %}</center></b>
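
The blog and summary row templates share one mapping from `l1` to a symbol (a check mark for 'Yes', a tilde for 'Partially', a cross for anything else), and only the summary view appends the JIRA key. A small Python sketch of that mapping follows, assuming plain-text output instead of HTML. `summary_cell` and its `include_jira` flag are hypothetical names introduced for illustration.

```python
CHECK = "\u2713"  # same code point as &#x2713; in the templates
CROSS = "\u2715"  # same code point as &#x2715; in the templates


def summary_cell(val, include_jira=True):
    """Render one matrix value as a symbol, optionally followed by its JIRA key.

    include_jira=True approximates capability-matrix-row-summary.md;
    include_jira=False approximates capability-matrix-row-blog.md.
    """
    level = val.get("l1")
    if level == "Yes":
        symbol = CHECK
    elif level == "Partially":
        symbol = "~"
    else:
        # Any other value, including 'No' or a missing l1, renders as a cross.
        symbol = CROSS
    jira = val.get("jira")
    if include_jira and jira:
        return f"{symbol} ({jira})"
    return symbol


if __name__ == "__main__":
    print(summary_cell({"l1": "Partially", "jira": "BEAM-102"}))
    print(summary_cell({"l1": "Partially", "jira": "BEAM-102"}, include_jira=False))
```

Note that the fallback branch means a misspelled `l1` value in the YAML would silently render as unsupported.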

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_includes/capability-matrix.md
----------------------------------------------------------------------
diff --git a/_includes/capability-matrix.md b/_includes/capability-matrix.md
new file mode 100644
index 0000000..222671e
--- /dev/null
+++ b/_includes/capability-matrix.md
@@ -0,0 +1,28 @@
+<div id='cap-{{ cap-view }}' style='display:{{ cap-display }}'>
+<table class='{{ cap-style }}'>
+  {% for category in cap-data.categories %}
+  <tr class='{{ cap-style }}' id='cap-{{ cap-view }}-{{ category.anchor }}'>
+    <th class='{{ cap-style }} color-metadata format-category' colspan='5' 
style='color:#{{ category.color-b }}'>{% if cap-view != 'blog' %}<div 
class='cap-toggle' onclick='ToggleTables({{ cap-toggle-details }}, "cap-{{ 
cap-other-view }}-{{ category.anchor }}")'>({% if cap-toggle-details == 1 
%}expand{% else %}collapse{% endif %} details)</div>{% endif %}{{ 
category.description }}</th>
+  </tr>
+  <tr class='{{ cap-style }}'>
+    <th class='{{ cap-style }} color-capability'></th>
+  {% for x in cap-data.columns %}
+    <th class='{{ cap-style }} color-platform format-platform' 
style='color:#{{ category.color-y }}'>{{ x.name }}</th>
+  {% endfor %}
+  </tr>
+  {% for row in category.rows %}
+  <tr class='{{ cap-style }}'>
+    <th class='{{ cap-style }} color-capability format-capability' 
style='color:#{{ category.color-y }}'>{{ row.name }}</th>
+    {% for val in row.values %}
+    {% capture value-markdown %}{% include capability-matrix-row-{{ cap-view 
}}.md %}{% endcapture %}
+
+    <td width='25%' class='{{ cap-style }}' style='background-color:#{% if 
val.l1 == 'Yes' %}{{ category.color-y }}{% elsif val.l1 == 'Partially' %}{{ 
category.color-p }}{% else %}{{ category.color-n }}{% endif 
%};border-color:#{{category.color-b}}'>{{ value-markdown }}</td>
+    {% endfor %}
+  </tr>
+  {% endfor %}
+  <tr class='{{ cap-style }}'>
+    <td class='{{ cap-style }} color-blank cap-blank' colspan='5'></td>
+  </tr>
+  {% endfor %}
+</table>
+</div>
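
Each table cell in the include above picks its background from the category's colour keys according to `l1`, and always uses `color-b` for the border. The selection can be sketched in Python as below, using the colours of the "What is being computed?" category from the YAML file. `cell_style` is a hypothetical helper name.

```python
# Colour keys of the 'what' category, copied from _data/capability-matrix.yml.
WHAT_COLORS = {
    "color-b": "ca1",  # border
    "color-y": "ec3",  # l1 == 'Yes'
    "color-p": "fe5",  # l1 == 'Partially'
    "color-n": "ddd",  # anything else
}


def cell_style(category, level):
    """Build the inline style string the template emits for one cell."""
    if level == "Yes":
        background = category["color-y"]
    elif level == "Partially":
        background = category["color-p"]
    else:
        background = category["color-n"]
    return f"background-color:#{background};border-color:#{category['color-b']}"


if __name__ == "__main__":
    for level in ("Yes", "Partially", "No"):
        print(level, "->", cell_style(WHAT_COLORS, level))
```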

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_includes/header.html
----------------------------------------------------------------------
diff --git a/_includes/header.html b/_includes/header.html
index 912d554..f92f620 100644
--- a/_includes/header.html
+++ b/_includes/header.html
@@ -10,7 +10,8 @@
         <li class="dropdown">
           <a href="#" class="dropdown-toggle" data-toggle="dropdown" 
role="button" aria-haspopup="true" aria-expanded="false">Documentation <span 
class="caret"></span></a>
           <ul class="dropdown-menu">
-            <li><a href="/getting_started/">Getting Started</a></li>
+            <li><a href="{{ site.baseurl }}/getting_started/">Getting 
Started</a></li>
+           <li><a href="{{ site.baseurl }}/capability-matrix/">Capability 
Matrix</a></li>
             <li><a href="https://goo.gl/ps8twC">Technical Docs</a></li>
             <li><a href="https://goo.gl/nk5OM0">Technical Vision</a></li>
           </ul>
@@ -19,17 +20,17 @@
           <a href="#" class="dropdown-toggle" data-toggle="dropdown" 
role="button" aria-haspopup="true" aria-expanded="false">Community <span 
class="caret"></span></a>
           <ul class="dropdown-menu">
             <li class="dropdown-header">Community</li>
-            <li><a href="/mailing_lists/">Mailing Lists</a></li>
+            <li><a href="{{ site.baseurl }}/mailing_lists/">Mailing 
Lists</a></li>
             <li><a href="https://goo.gl/ps8twC">Technical Docs</a></li>
             <li><a href="https://goo.gl/nk5OM0">Technical Vision</a></li>
-            <li><a href="/team/">Apache Beam Team</a></li>
+            <li><a href="{{ site.baseurl }}/team/">Apache Beam Team</a></li>
             <li role="separator" class="divider"></li>
             <li class="dropdown-header">Contribute</li>
-            <li><a href="/source_repository/">Source Repository</a></li>
-            <li><a href="/issue_tracking/">Issue Tracking</a></li>
+            <li><a href="{{ site.baseurl }}/source_repository/">Source 
Repository</a></li>
+            <li><a href="{{ site.baseurl }}/issue_tracking/">Issue 
Tracking</a></li>
           </ul>
         </li>
-        <li><a href="/blog">Blog</a></li>
+        <li><a href="{{ site.baseurl }}/blog">Blog</a></li>
       </ul>
     </div><!--/.nav-collapse -->
   </div>

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_layouts/post.html
----------------------------------------------------------------------
diff --git a/_layouts/post.html b/_layouts/post.html
index 3a0fb52..696bad2 100644
--- a/_layouts/post.html
+++ b/_layouts/post.html
@@ -1,11 +1,13 @@
 ---
 layout: default
 ---
+{% assign authors = page.authors %}
+
 <article class="post" itemscope itemtype="http://schema.org/BlogPosting">
 
   <header class="post-header">
     <h1 class="post-title" itemprop="name headline">{{ page.title }}</h1>
-    <p class="post-meta"><time datetime="{{ page.date | date_to_xmlschema }}" 
itemprop="datePublished">{{ page.date | date: "%b %-d, %Y" }}</time>{% if 
page.author %} • <span itemprop="author" itemscope 
itemtype="http://schema.org/Person"><span itemprop="name">{{ page.author 
}}</span></span>{% endif %}</p>
+    <p class="post-meta"><time datetime="{{ page.date | date_to_xmlschema }}" 
itemprop="datePublished">{{ page.date | date: "%b %-d, %Y" }}</time>{% if 
authors %} • {% include authors-list.md %}{% endif %}</p>
   </header>
 
   <div class="post-content" itemprop="articleBody">

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_pages/blog.md
----------------------------------------------------------------------
diff --git a/_pages/blog.md b/_pages/blog.md
index bd58119..5cf3b8e 100644
--- a/_pages/blog.md
+++ b/_pages/blog.md
@@ -8,11 +8,10 @@ This is the blog for the Apache Beam project. This blog 
contains news and update
 for the project.
 
 {% for post in site.posts %}
-{% assign author = site.data.authors[post.author] %}
+{% assign authors = post.authors %}
 
 ### <a class="post-link" href="{{ post.url | prepend: site.baseurl }}">{{ 
post.title }}</a>
-<i>{{ post.date | date: "%b %-d, %Y" }}{% if author %} - posted by {{ 
author.name }} [<a href="https://twitter.com/{{ author.twitter }}">@{{ 
author.twitter }}</a>]
-{% endif %}</i>
+<i>{{ post.date | date: "%b %-d, %Y" }}{% if authors %} • {% include 
authors-list.md %}{% endif %}</i>
 
 {{ post.excerpt }}
 

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_pages/capability-matrix.md
----------------------------------------------------------------------
diff --git a/_pages/capability-matrix.md b/_pages/capability-matrix.md
new file mode 100644
index 0000000..f1b4b05
--- /dev/null
+++ b/_pages/capability-matrix.md
@@ -0,0 +1,41 @@
+---
+layout: default
+title: "Apache Beam Capability Matrix"
+permalink: /capability-matrix/
+---
+
+# Apache Beam Capability Matrix
+
+Apache Beam (incubating) provides a portable API layer for building 
sophisticated data-parallel processing pipelines that may be executed across a 
diversity of execution engines, or <i>runners</i>. The core concepts of this 
layer are based upon the Beam Model (formerly referred to as the [Dataflow 
Model](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)), and implemented to 
varying degrees in each Beam runner. To help clarify the capabilities of 
individual runners, we've created the capability matrix below.
+
+Individual capabilities have been grouped by their corresponding <span 
class="wwwh-what-dark">What</span> / <span class="wwwh-where-dark">Where</span> 
/ <span class="wwwh-when-dark">When</span> / <span 
class="wwwh-how-dark">How</span> question:
+
+- <span class="wwwh-what-dark">What</span> results are being calculated?
+- <span class="wwwh-where-dark">Where</span> in event time?
+- <span class="wwwh-when-dark">When</span> in processing time?
+- <span class="wwwh-how-dark">How</span> do refinements of results relate?
+
+For more details on the <span class="wwwh-what-dark">What</span> / <span 
class="wwwh-where-dark">Where</span> / <span class="wwwh-when-dark">When</span> 
/ <span class="wwwh-how-dark">How</span> breakdown of concepts, we recommend 
reading through the <a 
href="http://oreilly.com/ideas/the-world-beyond-batch-streaming-102">Streaming 
102</a> post on O'Reilly Radar.
+
+Note that in the future, we intend to add additional tables beyond the current 
set, for things like runtime characteristics (e.g. at-least-once vs 
exactly-once), performance, etc.
+
+{% include capability-matrix-common.md %}
+{% assign cap-data=site.data.capability-matrix %}
+
+<!-- Summary table -->
+{% assign cap-style='cap-summary' %}
+{% assign cap-view='summary' %}
+{% assign cap-other-view='full' %}
+{% assign cap-toggle-details=1 %}
+{% assign cap-display='none' %}
+
+{% include capability-matrix.md %}
+
+<!-- Full details table -->
+{% assign cap-style='cap' %}
+{% assign cap-view='full' %}
+{% assign cap-other-view='summary' %}
+{% assign cap-toggle-details=0 %}
+{% assign cap-display='block' %}
+
+{% include capability-matrix.md %}
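
Note on the data this page consumes: the template in `_includes/capability-matrix.md` hard-codes `colspan='5'` and `width='25%'`, so it implicitly expects every row to carry exactly one value per entry in `columns`, in the same order. A small consistency check along the following lines could catch a row that drifts out of step as runners are added. This is a sketch only and not part of the commit: `check_matrix` is a hypothetical helper, and the dictionary layout is assumed to mirror the `columns` / `categories` / `rows` / `values` structure shown in the blog post's snapshot.

```python
def check_matrix(matrix):
    """Return a list of human-readable problems found in the matrix data."""
    expected = [col["class"] for col in matrix["columns"]]
    allowed = {"Yes", "Partially", "No"}
    problems = []
    for category in matrix["categories"]:
        for row in category["rows"]:
            # Each row should list its values in the same order as the columns.
            classes = [value.get("class") for value in row["values"]]
            if classes != expected:
                problems.append(f"{row['name']}: columns {classes} != {expected}")
            # The template only distinguishes these three l1 strings.
            for value in row["values"]:
                if value.get("l1") not in allowed:
                    problems.append(
                        f"{row['name']}/{value.get('class')}: "
                        f"unexpected l1 {value.get('l1')!r}"
                    )
    return problems


# A one-row excerpt in the assumed shape (values copied from the snapshot).
matrix = {
    "columns": [
        {"class": "model", "name": "Beam Model"},
        {"class": "dataflow", "name": "Google Cloud Dataflow"},
        {"class": "flink", "name": "Apache Flink"},
        {"class": "spark", "name": "Apache Spark"},
    ],
    "categories": [
        {
            "anchor": "where",
            "rows": [
                {
                    "name": "Sliding windows",
                    "values": [
                        {"class": "model", "l1": "Yes"},
                        {"class": "dataflow", "l1": "Yes"},
                        {"class": "flink", "l1": "Yes"},
                        {"class": "spark", "l1": "No"},
                    ],
                }
            ],
        }
    ],
}

print(check_matrix(matrix))  # an empty list means no problems were found
```

Adding a fifth runner would also require revisiting the hard-coded `colspan` and `width` values in the template, which a check like this cannot detect on its own.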

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_posts/2016-02-22-beam-has-a-logo.markdown
----------------------------------------------------------------------
diff --git a/_posts/2016-02-22-beam-has-a-logo.markdown 
b/_posts/2016-02-22-beam-has-a-logo.markdown
index e5e928e..f643a36 100644
--- a/_posts/2016-02-22-beam-has-a-logo.markdown
+++ b/_posts/2016-02-22-beam-has-a-logo.markdown
@@ -4,7 +4,8 @@ title:  "Apache Beam has a logo!"
 date:   2016-02-22 10:21:48 -0800
 excerpt_separator: <!--more-->
 categories: beam update website
-author: jamesmalone
+authors: 
+- jamesmalone
 ---
 
 One of the major benefits of Apache Beam is the fact that it unifies both

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_posts/2016-02-22-beam-has-a-logo0.markdown
----------------------------------------------------------------------
diff --git a/_posts/2016-02-22-beam-has-a-logo0.markdown 
b/_posts/2016-02-22-beam-has-a-logo0.markdown
index adf9599..726433c 100644
--- a/_posts/2016-02-22-beam-has-a-logo0.markdown
+++ b/_posts/2016-02-22-beam-has-a-logo0.markdown
@@ -4,7 +4,8 @@ title:  "Dataflow Python SDK is now public!"
 date:   2016-02-25 13:00:00 -0800
 excerpt_separator: <!--more-->
 categories: beam python sdk
-author: jamesmalone
+authors:
+- jamesmalone
 ---
 
 When the Apache Beam project proposed entry into the [Apache 
Incubator](http://wiki.apache.org/incubator/BeamProposal) the proposal
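
Note on the `author` to `authors` change in the two posts above: the templates now read only `page.authors` / `post.authors`, so any post that still uses the old single `author` key would render without a byline. A minimal Python sketch of the same migration applied to already-parsed front matter follows. `normalize_authors` is a hypothetical helper, not something the site contains, and the front matter is assumed to have been loaded into a plain dict.

```python
def normalize_authors(front_matter):
    """Return a copy of the front matter that uses only the `authors` list."""
    result = dict(front_matter)
    single = result.pop("author", None)
    authors = list(result.get("authors") or [])
    # Fold a legacy single author into the list, avoiding duplicates.
    if single and single not in authors:
        authors.insert(0, single)
    if authors:
        result["authors"] = authors
    return result


old_post = {"title": "Apache Beam has a logo!", "author": "jamesmalone"}
print(normalize_authors(old_post))
```

Posts that already use `authors`, such as the new capability matrix post, pass through unchanged.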

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_posts/2016-03-17-compatability-matrix.md
----------------------------------------------------------------------
diff --git a/_posts/2016-03-17-compatability-matrix.md 
b/_posts/2016-03-17-compatability-matrix.md
new file mode 100644
index 0000000..df4df31
--- /dev/null
+++ b/_posts/2016-03-17-compatability-matrix.md
@@ -0,0 +1,596 @@
+---
+layout: post
+title:  "Clarifying & Formalizing Runner Capabilities"
+date:   2016-03-17 11:00:00 -0700
+excerpt_separator: <!--more-->
+categories: beam compatibility
+authors:
+  - fjp
+  - takidau
+
+capability-matrix-snapshot:
+  columns:
+    - class: model
+      name: Beam Model
+    - class: dataflow
+      name: Google Cloud Dataflow
+    - class: flink
+      name: Apache Flink
+    - class: spark
+      name: Apache Spark
+  categories:
+    - description: What is being computed?
+      anchor: what
+      color-b: 'ca1'
+      color-y: 'ec3'
+      color-p: 'fe5'
+      color-n: 'ddd'
+      rows:
+        - name: ParDo
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: element-wise processing
+              l3: Element-wise transformation parameterized by a chunk of user 
code. Elements are processed in bundles, with initialization and termination 
hooks. Bundle size is chosen by the runner and cannot be controlled by user 
code. ParDo processes a main input PCollection one element at a time, but 
provides side input access to additional PCollections.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: Batch mode uses large bundle sizes. Streaming uses smaller 
bundle sizes.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: ParDo itself, as per-element transformation with UDFs, is 
fully supported by Flink for both batch and streaming.
+            - class: spark
+              l1: 'Yes'
+              l2: fully supported
+              l3: ParDo applies per-element transformations as Spark 
FlatMapFunction.
+        - name: GroupByKey
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: key grouping
+              l3: Grouping of key-value pairs per key, window, and pane. (See 
also other tabs.)
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "Uses Flink's keyBy for key grouping. When grouping by 
window in streaming (creating the panes) the Flink runner uses the Beam code. 
This guarantees support for all windowing and triggering mechanisms."
+            - class: spark
+              l1: 'Partially'
+              l2: group by window in batch only
+              l3: "Uses Spark's groupByKey for grouping. Grouping by window is 
currently only supported in batch."
+        - name: Flatten
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: collection concatenation
+              l3: Concatenates multiple homogeneously typed collections 
together.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+            - class: spark
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+              
+        - name: Combine
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: associative &amp; commutative aggregation
+              l3: 'Application of an associative, commutative operation over 
all values ("globally") or over all values associated with each key ("per 
key"). Can be implemented using ParDo, but often more efficient implementations 
exist.'
+            - class: dataflow
+              l1: 'Yes'
+              l2: 'efficient execution'
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: 'fully supported'
+              l3: Uses a combiner for pre-aggregation for batch and streaming.
+            - class: spark
+              l1: 'Yes'
+              l2: fully supported
+              l3: Supports GroupedValues, Globally and PerKey.
+
+        - name: Composite Transforms
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: user-defined transformation subgraphs
+              l3: Allows easy extensibility for library writers.  In the near 
future, we expect there to be more information provided at this level -- 
customized metadata hooks for monitoring, additional runtime/environment hooks, 
etc.
+            - class: dataflow
+              l1: 'Partially'
+              l2: supported via inlining
+              l3: Currently composite transformations are inlined during 
execution. The structure is later recreated from the names, but other transform 
level information (if added to the model) will be lost.
+            - class: flink
+              l1: 'Partially'
+              l2: supported via inlining
+              l3: ''
+            - class: spark
+              l1: 'Partially'
+              l2: supported via inlining
+              l3: ''
+
+        - name: Side Inputs
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: additional elements available during DoFn execution
+              l3: Side inputs are additional <tt>PCollections</tt> whose 
contents are computed during pipeline execution and then made accessible to 
DoFn code. The exact shape of the side input depends both on the 
<tt>PCollectionView</tt> used to describe the access pattern (iterable, map, 
singleton) and the window of the element from the main input that is currently 
being processed.
+            - class: dataflow
+              l1: 'Yes'
+              l2: some size restrictions in streaming
+              l3: Batch mode supports a distributed implementation, but 
streaming mode may force some size restrictions. Neither mode is able to push 
lookups directly up into key-based sources.
+            - class: flink
+              jira: BEAM-102
+              l1: 'Partially'
+              l2: not supported in streaming
+              l3: Supported in batch. Side inputs for streaming are currently 
WiP.
+            - class: spark
+              l1: 'Partially'
+              l2: not supported in streaming
+              l3: "Side input is actually a broadcast variable in Spark so it 
can't be updated during the life of a job. Spark-runner implementation of side 
input is more of an immutable, static, side input."
+
+        - name: Source API
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: user-defined sources
+              l3: Allows users to provide additional input sources. Supports 
both bounded and unbounded data. Includes hooks necessary to provide efficient 
parallelization (size estimation, progress information, dynamic splitting, etc).
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: 
+            - class: flink
+              jira: BEAM-103
+              l1: 'Partially'
+              l2: parallelism 1 in streaming
+              l3: Fully supported in batch. In streaming, sources currently 
run with parallelism 1.
+            - class: spark
+              l1: 'Yes'
+              l2: fully supported
+              l3: 
+              
+        - name: Aggregators
+          values:
+            - class: model
+              l1: 'Partially'
+              l2: user-provided metrics
+              l3: Allow transforms to aggregate simple metrics across bundles 
in a <tt>DoFn</tt>. Semantically equivalent to using a side output, but support 
partial results as the transform executes. Will likely want to augment 
<tt>Aggregators</tt> to be more useful for processing unbounded data by making 
them windowed.
+            - class: dataflow
+              l1: 'Partially'
+              l2: may miscount in streaming mode
+              l3: Current model is fully supported in batch mode. In streaming 
mode, <tt>Aggregators</tt> may under or overcount when bundles are retried.
+            - class: flink
+              l1: 'Partially'
+              l2: may undercount in streaming
+              l3: Current model is fully supported in batch. In streaming 
mode, <tt>Aggregators</tt> may undercount.
+            - class: spark
+              l1: 'Partially'
+              l2: streaming requires more testing
+              l3: "Uses Spark's <tt>AccumulatorParam</tt> mechanism"
+
+        - name: Keyed State
+          values:
+            - class: model
+              jira: BEAM-25
+              l1: 'No'
+              l2: storage per key, per window
+              l3: Allows fine-grained access to per-key, per-window persistent 
state. Necessary for certain use cases (e.g. high-volume windows which store 
large amounts of data, but typically only access small portions of it; complex 
state machines; etc.) that are not easily or efficiently addressed via 
<tt>Combine</tt> or <tt>GroupByKey</tt>+<tt>ParDo</tt>. 
+            - class: dataflow
+              l1: 'No'
+              l2: pending model support
+              l3: Dataflow already supports keyed state internally, so adding 
support for this should be easy once the Beam model exposes it.
+            - class: flink
+              l1: 'No'
+              l2: pending model support
+              l3: Flink already supports keyed state, so adding support for 
this should be easy once the Beam model exposes it.
+            - class: spark
+              l1: 'No'
+              l2: pending model support
+              l3: Spark supports keyed state with mapWithState() so support 
should be straightforward.
+              
+              
+    - description: Where in event time?
+      anchor: where
+      color-b: '37d'
+      color-y: '59f'
+      color-p: '8cf'
+      color-n: 'ddd'
+      rows:
+        - name: Global windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: all time
+              l3: The default window which covers all of time. (Basically how 
traditional batch cases fit in the model.)
+            - class: dataflow
+              l1: 'Yes'
+              l2: default
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: spark
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+              
+        - name: Fixed windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: periodic, non-overlapping
+              l3: Fixed-size, timestamp-based windows. (Hourly, Daily, etc)
+            - class: dataflow
+              l1: 'Yes'
+              l2: built-in
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: spark
+              l1: Partially
+              l2: currently only supported in batch
+              l3: ''
+              
+        - name: Sliding windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: periodic, overlapping
+              l3: Possibly overlapping fixed-size timestamp-based windows 
(Every minute, use the last ten minutes of data.)
+            - class: dataflow
+              l1: 'Yes'
+              l2: built-in
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+
+        - name: Session windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: activity-based
+              l3: Based on bursts of activity separated by a gap size. 
Different per key.
+            - class: dataflow
+              l1: 'Yes'
+              l2: built-in
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'No'
+              l2: pending Spark engine support
+              l3: ''
+
+        - name: Custom windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: user-defined windows
+              l3: All windows must implement <tt>BoundedWindow</tt>, which 
specifies a max timestamp. Each <tt>WindowFn</tt> assigns elements to an 
associated window.
+            - class: dataflow
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'No'
+              l2: pending Spark engine support
+              l3: ''
+
+        - name: Custom merging windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: user-defined merging windows
+              l3: A custom <tt>WindowFn</tt> additionally specifies whether 
and how to merge windows.
+            - class: dataflow
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'No'
+              l2: pending Spark engine support
+              l3: ''
+
+        - name: Timestamp control
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: output timestamp for window panes
+              l3: For a grouping transform, such as GBK or Combine, an 
OutputTimeFn specifies (1) how to combine input timestamps within a window and 
(2) how to merge aggregated timestamps when windows merge.
+            - class: dataflow
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'No'
+              l2: pending Spark engine support
+              l3: ''
+
+
+              
+    - description: When in processing time?
+      anchor: when
+      color-b: '6a4'
+      color-y: '8c6'
+      color-p: 'ae8'
+      color-n: 'ddd'
+      rows:
+        
+        - name: Configurable triggering
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: user customizable
+              l3: Triggering may be specified by the user (instead of simply 
driven by hardcoded defaults).
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: Fully supported in streaming mode. In batch mode, 
intermediate trigger firings are effectively meaningless.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+
+        - name: Event-time triggers
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: relative to event time
+              l3: Triggers that fire in response to event-time completeness 
signals, such as watermarks progressing.
+            - class: dataflow
+              l1: 'Yes'
+              l2: yes in streaming, fixed granularity in batch
+              l3: Fully supported in streaming mode. In batch mode, currently 
watermark progress jumps from the beginning of time to the end of time once the 
input has been fully consumed, thus no additional triggering granularity is 
available.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+              
+        - name: Processing-time triggers
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: relative to processing time
+              l3: Triggers that fire in response to processing-time advancing.
+            - class: dataflow
+              l1: 'Yes'
+              l2: yes in streaming, fixed granularity in batch
+              l3: Fully supported in streaming mode. In batch mode, from the 
perspective of triggers, processing time currently jumps from the beginning of 
time to the end of time once the input has been fully consumed, thus no 
additional triggering granularity is available.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'Yes'
+              l2: "This is Spark streaming's native model"
+              l3: "Spark processes streams in micro-batches. The micro-batch 
size is actually a pre-set, fixed, time interval. Currently, the runner takes 
the first window size in the pipeline and sets its size as the batch interval. 
Any following window operations will be considered processing time windows and 
will affect triggering."
+              
+        - name: Count triggers
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: every N elements
+              l3: Triggers that fire after seeing at least N elements.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: Fully supported in streaming mode. In batch mode, elements 
are processed in the largest bundles possible, so count-based triggers are 
effectively meaningless.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+
+        - name: '[Meta]data driven triggers'
+          values:
+            - class: model
+              jira: BEAM-101
+              l1: 'No'
+              l2: in response to data
+              l3: Triggers that fire in response to attributes of the data 
being processed.
+            - class: dataflow
+              l1: 'No'
+              l2: pending model support
+              l3: 
+            - class: flink
+              l1: 'No'
+              l2: pending model support
+              l3: 
+            - class: spark
+              l1: 'No'
+              l2: pending model support
+              l3: 
+
+        - name: Composite triggers
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: compositions of one or more sub-triggers
+              l3: Triggers which compose other triggers in more complex 
structures, such as logical AND, logical OR, early/on-time/late, etc.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+              
+        - name: Allowed lateness
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: event-time bound on window lifetimes
+              l3: A way to bound the useful lifetime of a window (in event 
time), after which any unemitted results may be materialized, the window 
contents may be garbage collected, and any additional late data that arrive for 
the window may be discarded.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: Fully supported in streaming mode. In batch mode no data is 
ever late.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+              
+        - name: Timers
+          values:
+            - class: model
+              jira: BEAM-27
+              l1: 'No'
+              l2: delayed processing callbacks
+              l3: A fine-grained mechanism for performing work at some point 
in the future, in either the event-time or processing-time domain. Useful for 
orchestrating delayed events, timeouts, etc. in complex per-key, 
per-window state machines.
+            - class: dataflow
+              l1: 'No'
+              l2: pending model support
+              l3: Dataflow already supports timers internally, so adding 
support for this should be easy once the Beam model exposes it.
+            - class: flink
+              l1: 'No'
+              l2: pending model support
+              l3: Flink already supports timers internally, so adding support 
for this should be easy once the Beam model exposes it.
+            - class: spark
+              l1: 'No'
+              l2: pending model support
+              l3: ''
+              
+              
+    - description: How do refinements relate?
+      anchor: how
+      color-b: 'b55'
+      color-y: 'd77'
+      color-p: 'faa'
+      color-n: 'ddd'
+      rows:
+        
+        - name: Discarding
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: panes discard elements when fired
+              l3: Elements are discarded from accumulated state as their pane 
is fired.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'Yes'
+              l2: fully supported
+              l3: 'Spark streaming natively discards elements after firing.'
+              
+        - name: Accumulating
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: panes accumulate elements across firings
+              l3: Elements are accumulated in state across multiple pane 
firings for the same window.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: Requires that the accumulated pane fits in memory, after 
being passed through the combiner (if relevant)
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and 
code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+              
+        - name: 'Accumulating &amp; Retracting'
+          values:
+            - class: model
+              jira: BEAM-91
+              l1: 'No'
+              l2: accumulation plus retraction of old panes
+              l3: Elements are accumulated across multiple pane firings and old emitted values are retracted. Also known as "backsies" ;-D
+            - class: dataflow
+              l1: 'No'
+              l2: pending model support
+              l3: ''
+            - class: flink
+              l1: 'No'
+              l2: pending model support
+              l3: ''
+            - class: spark
+              l1: 'No'
+              l2: pending model support
+              l3: ''
+              
+
+---
+
+With initial code drops complete ([Dataflow SDK and Runner](https://github.com/apache/incubator-beam/pull/1), [Flink Runner](https://github.com/apache/incubator-beam/pull/12), [Spark Runner](https://github.com/apache/incubator-beam/pull/42)) and expressed interest in runner implementations for [Storm](https://issues.apache.org/jira/browse/BEAM-9), [Hadoop](https://issues.apache.org/jira/browse/BEAM-19), and [Gearpump](https://issues.apache.org/jira/browse/BEAM-79) (amongst others), we wanted to start addressing a big question in the Apache Beam (incubating) community: what capabilities will each runner be able to support?
+
+While we’d love to have a world where all runners support the full suite of semantics included in the Beam Model (formerly referred to as the [Dataflow Model](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)), practically speaking, there will always be certain features that some runners can’t provide. For example, a Hadoop-based runner would be inherently batch-based and may be unable to (easily) implement support for unbounded collections. However, that doesn’t prevent it from being extremely useful for a large set of uses. In other cases, the implementations provided by one runner may have slightly different semantics than those provided by another (e.g. even though the current suite of runners all support exactly-once delivery guarantees, an [Apache Samza](http://samza.apache.org/) runner, which would be a welcome addition, would currently only support at-least-once).
+
+To help clarify things, we’ve been working on enumerating the key features of the Beam model in a [capability matrix]({{ site.baseurl }}/capability-matrix/) for all existing runners, categorized around the four key questions addressed by the model: <span class="wwwh-what-dark">What</span> / <span class="wwwh-where-dark">Where</span> / <span class="wwwh-when-dark">When</span> / <span class="wwwh-how-dark">How</span> (if you’re not familiar with those questions, you might want to read through [Streaming 102](http://oreilly.com/ideas/the-world-beyond-batch-streaming-102) for an overview). This table will be maintained over time as the model evolves, our understanding grows, and runners are created or features added.
+
+Included below is a summary snapshot of our current understanding of the capabilities of the existing runners (see the [live version]({{ site.baseurl }}/capability-matrix/) for full details, descriptions, and Jira links); since integration is still under way, the system as a whole isn’t yet in a completely stable, usable state. But that should be changing in the near future, and we’ll be updating loud and clear on this blog when the first supported Beam 1.0 release happens.
+
+In the meantime, these tables should help clarify where we expect to be in the very near term, and help guide expectations about what existing runners are capable of, and what features runner implementers will be tackling next.
+
+{% include capability-matrix-common.md %}
+{% assign cap-data=page.capability-matrix-snapshot %}
+
+<!-- Summary table -->
+{% assign cap-style='cap-summary' %}
+{% assign cap-view='blog' %}
+{% assign cap-other-view='full' %}
+{% assign cap-toggle-details=1 %}
+{% assign cap-display='block' %}
+
+{% include capability-matrix.md %}
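
Editor's note (not part of the commit): the "Discarding" and "Accumulating" rows in the YAML above describe what happens to a window's state each time a pane fires. The following is a minimal Python sketch of that difference, assuming a single window whose panes emit the sum of their elements. The function name `fire_panes` and the `mode` strings are hypothetical and are not Beam API.

```python
from typing import Iterable, List


def fire_panes(firings: Iterable[List[int]], mode: str) -> List[int]:
    """Return the sum emitted at each pane firing for a single window.

    Illustrative only; not Beam API.
    mode="discarding":   state is cleared once a pane has fired.
    mode="accumulating": state is kept across firings of the same window.
    """
    if mode not in ("discarding", "accumulating"):
        raise ValueError(f"unknown mode: {mode}")
    state: List[int] = []      # elements currently held for the window
    emitted: List[int] = []    # one output value per pane firing
    for new_elements in firings:
        state.extend(new_elements)
        emitted.append(sum(state))
        if mode == "discarding":
            state.clear()      # elements are dropped as their pane fires
    return emitted


# Elements arriving between three successive trigger firings.
firings = [[3, 4], [5], [1, 2]]
print(fire_panes(firings, "discarding"))    # [7, 5, 3]
print(fire_panes(firings, "accumulating"))  # [7, 12, 15]
```

In discarding mode each pane reflects only the elements that arrived since the previous firing; in accumulating mode each pane reflects everything seen so far for the window, which is why the Dataflow row notes that the accumulated pane must fit in memory.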

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/_sass/capability-matrix.scss
----------------------------------------------------------------------
diff --git a/_sass/capability-matrix.scss b/_sass/capability-matrix.scss
new file mode 100644
index 0000000..3b0638a
--- /dev/null
+++ b/_sass/capability-matrix.scss
@@ -0,0 +1,127 @@
+/* What/Where/When/How colors. */
+.wwwh-what-dark {
+    color:#ca1;
+    font-weight:bold;
+    font-style:italic;
+}
+
+.wwwh-where-dark {
+    color:#37d;
+    font-weight:bold;
+    font-style:italic;
+}
+
+.wwwh-when-dark {
+    color:#6a4;
+    font-weight:bold;
+    font-style:italic;
+}
+
+.wwwh-how-dark {
+    color:#b55;
+    font-weight:bold;
+    font-style:italic;
+}
+
+/* Capability matrix general sizing, alignment etc. */
+table.cap {
+    border-spacing: 0px;
+    border-collapse: collapse;
+    padding: 2px;
+}
+
+td.cap {
+    border-width: 1px;
+    border-style:solid;
+    vertical-align:text-top;
+    padding:0.5ex;
+}
+
+th.cap, tr.cap, table.cap {
+    border-width: 1px;
+    border-style:solid;
+    vertical-align:text-top;
+    padding:0.5ex;
+}
+
+td.cap-blank {
+    padding:10px;
+}
+
+/* Capability matrix blog-post sizing, alignment etc. */
+table.cap-summary {
+    border-spacing: 0px;
+    border-collapse: collapse;
+    padding: 2px;
+    width:600px;
+}
+
+td.cap-summary {
+    border-width: 1px;
+    border-style:solid;
+    vertical-align:text-top;
+    padding:0.5ex;
+}
+
+th.cap-summary, tr.cap-summary, table.cap-summary {
+    border-width: 1px;
+    border-style:solid;
+    vertical-align:text-top;
+    padding:0.5ex;
+}
+
+td.cap-summary-blank {
+    padding:10px;
+}
+
+/* Capability matrix semantic coloring. */
+th.color-metadata, td.color-metadata {
+    background-color:#fff;
+    border-color:#fff;
+    color:#000;
+}
+
+th.color-capability {
+    background-color:#333;
+    border-color:#222;
+}
+
+th.color-platform {
+    background-color:#333;
+    border-color:#222;
+}
+
+td.color-blank {
+    background-color:#fff;
+    color:#fff;
+}
+
+/* Capability matrix semantic formatting */
+th.format-category {
+    vertical-align:text-top;
+    font-size:20px;
+    text-align:center;
+}
+
+th.format-capability {
+    text-align:right;
+    white-space:nowrap;
+}
+
+th.format-platform {
+    text-align:center;
+}
+
+/* Capability matrix expand/collapse details toggle. */
+div.cap-toggle {
+    border-color:#000;
+    color:#000;
+    padding-top:1.5ex;
+    border-style:solid;
+    border-width:0px;
+    text-align:center;
+    cursor:pointer;
+    position:absolute;
+    font-size:12px;
+    font-weight:normal;
+}

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html
----------------------------------------------------------------------
diff --git a/content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html b/content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html
index 14d436b..aa23635 100644
--- a/content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html
+++ b/content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html
@@ -44,6 +44,7 @@
           <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Documentation <span class="caret"></span></a>
           <ul class="dropdown-menu">
             <li><a href="/getting_started/">Getting Started</a></li>
+           <li><a href="/capability-matrix/">Capability Matrix</a></li>
             <li><a href="https://goo.gl/ps8twC">Technical Docs</a></li>
             <li><a href="https://goo.gl/nk5OM0">Technical Vision</a></li>
           </ul>
@@ -75,11 +76,14 @@
     <div class="container" role="main">
 
       <div class="container">
-        <article class="post" itemscope itemtype="http://schema.org/BlogPosting">
+        
+
+<article class="post" itemscope itemtype="http://schema.org/BlogPosting">
 
   <header class="post-header">
     <h1 class="post-title" itemprop="name headline">Dataflow Python SDK is now public!</h1>
-    <p class="post-meta"><time datetime="2016-02-25T22:00:00+01:00" itemprop="datePublished">Feb 25, 2016</time> • <span itemprop="author" itemscope itemtype="http://schema.org/Person"><span itemprop="name">jamesmalone</span></span></p>
+    <p class="post-meta"><time datetime="2016-02-25T13:00:00-08:00" itemprop="datePublished">Feb 25, 2016</time> •  James Malone [<a href="https://twitter.com/chimerasaurus">@chimerasaurus</a>]
+</p>
   </header>
 
   <div class="post-content" itemprop="articleBody">

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/931d7f51/content/beam/update/website/2016/02/22/beam-has-a-logo.html
----------------------------------------------------------------------
diff --git a/content/beam/update/website/2016/02/22/beam-has-a-logo.html b/content/beam/update/website/2016/02/22/beam-has-a-logo.html
index 4004673..d2ce9bc 100644
--- a/content/beam/update/website/2016/02/22/beam-has-a-logo.html
+++ b/content/beam/update/website/2016/02/22/beam-has-a-logo.html
@@ -44,6 +44,7 @@
           <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Documentation <span class="caret"></span></a>
           <ul class="dropdown-menu">
             <li><a href="/getting_started/">Getting Started</a></li>
+           <li><a href="/capability-matrix/">Capability Matrix</a></li>
             <li><a href="https://goo.gl/ps8twC">Technical Docs</a></li>
             <li><a href="https://goo.gl/nk5OM0">Technical Vision</a></li>
           </ul>
@@ -75,11 +76,14 @@
     <div class="container" role="main">
 
       <div class="container">
-        <article class="post" itemscope itemtype="http://schema.org/BlogPosting">
+        
+
+<article class="post" itemscope itemtype="http://schema.org/BlogPosting">
 
   <header class="post-header">
     <h1 class="post-title" itemprop="name headline">Apache Beam has a logo!</h1>
-    <p class="post-meta"><time datetime="2016-02-22T19:21:48+01:00" itemprop="datePublished">Feb 22, 2016</time> • <span itemprop="author" itemscope itemtype="http://schema.org/Person"><span itemprop="name">jamesmalone</span></span></p>
+    <p class="post-meta"><time datetime="2016-02-22T10:21:48-08:00" itemprop="datePublished">Feb 22, 2016</time> •  James Malone [<a href="https://twitter.com/chimerasaurus">@chimerasaurus</a>]
+</p>
   </header>
 
   <div class="post-content" itemprop="articleBody">
@@ -99,7 +103,7 @@ now has a logo.</p>
 unification of batch and streaming, as beams of light, within the ‘B’. We will base
 our future website and documentation design around this logo and its coloring. We
 will also make various permutations and resolutions of this logo available in the
-coming weeks. For any questions or comments, send an email to the <code>dev@</code> email list
+coming weeks. For any questions or comments, send an email to the <code class="highlighter-rouge">dev@</code> email list
 for Apache Beam.</p>
 
   </div>
