Repository: beam-site
Updated Branches:
  refs/heads/asf-site cb6d7d77e -> b5748765f


Add Pipeline I/O section to website - outline + move some existing content

* I did not go with a single page for all this content because both Java and 
Python have enough unique content that they deserve their own separate sections 
(i.e., tabs on the code aren't enough). The "click to the next page" 
model currently implemented lets the user pick Java vs. Python, and after 
those pages the next page for both points at the same place: users mostly 
follow the same path, diverging for Java- or Python-specific content and then 
converging again.
* I moved the "list of built-in I/O" content over to its own separate page 
since it'd be nice to have more content there (e.g. a capabilities matrix), and 
it felt special enough to pull out of the programming guide.
* We decided not to put all of this content in the contribute section of the 
site, since we don't expect all users to contribute their I/O transforms. We 
want most of the docs to just be about writing an I/O transform, and then lay 
out the contribution expectations in the contribute part of the I/O section.


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/f2171885
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/f2171885
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/f2171885

Branch: refs/heads/asf-site
Commit: f21718850c645c83767f9787d335964da142fda9
Parents: cb6d7d7
Author: Stephen Sisk <[email protected]>
Authored: Wed Mar 8 17:49:37 2017 -0800
Committer: Davor Bonaci <[email protected]>
Committed: Fri Mar 17 18:33:43 2017 -0700

----------------------------------------------------------------------
 src/_includes/header.html                  |  1 +
 src/documentation/io/authoring-java.md     | 15 ++++++
 src/documentation/io/authoring-overview.md | 44 ++++++++++++++++++
 src/documentation/io/authoring-python.md   | 18 ++++++++
 src/documentation/io/built-in.md           | 61 +++++++++++++++++++++++++
 src/documentation/io/contributing.md       | 15 ++++++
 src/documentation/io/io-toc.md             | 26 +++++++++++
 src/documentation/io/testing.md            | 19 ++++++++
 src/documentation/programming-guide.md     | 54 ++--------------------
 src/documentation/sdks/java.md             | 21 +--------
 10 files changed, 204 insertions(+), 70 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/beam-site/blob/f2171885/src/_includes/header.html
----------------------------------------------------------------------
diff --git a/src/_includes/header.html b/src/_includes/header.html
index 28000d8..1ea3496 100644
--- a/src/_includes/header.html
+++ b/src/_includes/header.html
@@ -42,6 +42,7 @@
               <li><a href="{{ site.baseurl 
}}/documentation/pipelines/design-your-pipeline/">Design Your Pipeline</a></li>
               <li><a href="{{ site.baseurl 
}}/documentation/pipelines/create-your-pipeline/">Create Your Pipeline</a></li>
               <li><a href="{{ site.baseurl 
}}/documentation/pipelines/test-your-pipeline/">Test Your Pipeline</a></li>
+              <li><a href="{{ site.baseurl 
}}/documentation/io/io-toc/">Pipeline I/O</a></li>
               <li role="separator" class="divider"></li>
                          <li class="dropdown-header">SDKs</li>
                          <li><a href="{{ site.baseurl 
}}/documentation/sdks/java/">Java SDK</a></li>

http://git-wip-us.apache.org/repos/asf/beam-site/blob/f2171885/src/documentation/io/authoring-java.md
----------------------------------------------------------------------
diff --git a/src/documentation/io/authoring-java.md 
b/src/documentation/io/authoring-java.md
new file mode 100644
index 0000000..6cdb6bd
--- /dev/null
+++ b/src/documentation/io/authoring-java.md
@@ -0,0 +1,15 @@
+---
+layout: default
+title: "Authoring I/O Transforms - Java"
+permalink: /documentation/io/authoring-java/
+---
+
+[Pipeline I/O Table of Contents]({{site.baseurl}}/documentation/io/io-toc/)
+
+# Authoring I/O Transforms - Java
+
+> Note: This guide is still in progress. There is an open issue to finish the 
guide: [BEAM-1025](https://issues.apache.org/jira/browse/BEAM-1025).
+
+# Next steps
+
+[Testing I/O Transforms]({{site.baseurl }}/documentation/io/testing/)

http://git-wip-us.apache.org/repos/asf/beam-site/blob/f2171885/src/documentation/io/authoring-overview.md
----------------------------------------------------------------------
diff --git a/src/documentation/io/authoring-overview.md 
b/src/documentation/io/authoring-overview.md
new file mode 100644
index 0000000..dab6a85
--- /dev/null
+++ b/src/documentation/io/authoring-overview.md
@@ -0,0 +1,44 @@
+---
+layout: default
+title: "Authoring I/O Transforms - Overview"
+permalink: /documentation/io/authoring-overview/
+---
+
+[Pipeline I/O Table of Contents]({{site.baseurl}}/documentation/io/io-toc/)
+
+# Authoring I/O Transforms - Overview
+
+_A guide for users who need to connect to a data store that isn't supported by 
the [Built-in I/O Transforms]({{site.baseurl }}/documentation/io/built-in/)_
+
+> Note: This guide is still in progress. There is an open issue to finish the 
guide: [BEAM-1025](https://issues.apache.org/jira/browse/BEAM-1025).
+
+* TOC
+{:toc}
+
+## Introduction
+TODO
+
+## Example I/O Transforms
+TODO
+
+## Suggested steps for implementers
+TODO
+
+## Read transforms
+TODO
+
+### When to implement using the Source API
+TODO
+
+## Write transforms
+TODO
+
+### When to implement using the Sink API
+TODO
+
+# Next steps
+
+For more details on actual implementation, continue with one of the 
language-specific guides:
+
+* [Authoring I/O Transforms - Python]({{site.baseurl 
}}/documentation/io/authoring-python/)
+* [Authoring I/O Transforms - Java]({{site.baseurl 
}}/documentation/io/authoring-java/)

http://git-wip-us.apache.org/repos/asf/beam-site/blob/f2171885/src/documentation/io/authoring-python.md
----------------------------------------------------------------------
diff --git a/src/documentation/io/authoring-python.md 
b/src/documentation/io/authoring-python.md
new file mode 100644
index 0000000..b6ccc56
--- /dev/null
+++ b/src/documentation/io/authoring-python.md
@@ -0,0 +1,18 @@
+---
+layout: default
+title: "Authoring I/O Transforms - Python"
+permalink: /documentation/io/authoring-python/
+---
+
+[Pipeline I/O Table of Contents]({{site.baseurl}}/documentation/io/io-toc/)
+
+# Authoring I/O Transforms - Python
+
+> Note: This guide is still in progress. There is an open issue to finish the 
guide: [BEAM-1025](https://issues.apache.org/jira/browse/BEAM-1025).
+
+TODO - move in the [current python SDK 
content]({{site.baseurl}}/documentation/sdks/python-custom-io/)
+
+
+# Next steps
+
+[Testing I/O Transforms]({{site.baseurl}}/documentation/io/testing/)

http://git-wip-us.apache.org/repos/asf/beam-site/blob/f2171885/src/documentation/io/built-in.md
----------------------------------------------------------------------
diff --git a/src/documentation/io/built-in.md b/src/documentation/io/built-in.md
new file mode 100644
index 0000000..9f96968
--- /dev/null
+++ b/src/documentation/io/built-in.md
@@ -0,0 +1,61 @@
+---
+layout: default
+title: "Built-in I/O Transforms"
+permalink: /documentation/io/built-in/
+---
+
+[Pipeline I/O Table of Contents]({{site.baseurl}}/documentation/io/io-toc/)
+
+# Built-in I/O Transforms
+
+This table contains the currently available I/O transforms.
+
+Consult the [Programming Guide I/O section]({{site.baseurl 
}}/documentation/programming-guide#io) for general usage instructions, and see 
the javadoc/pydoc for the particular I/O transforms.
+
+
+<table class="table table-bordered">
+<tr>
+  <th>Language</th>
+  <th>File-based</th>
+  <th>Messaging</th>
+  <th>Database</th>
+</tr>
+<tr>
+  <td>Java</td>
+  <td>
+    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java">AvroIO</a></p>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/hdfs">Apache Hadoop HDFS</a></p>
+    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java">TextIO</a></p>
+    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/">XML</a></p>
+  </td>
+  <td>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/jms">JMS</a></p>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/kafka">Apache Kafka</a></p>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/kinesis">Amazon Kinesis</a></p>
+    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io">Google Cloud PubSub</a></p>
+  </td>
+  <td>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/hbase">Apache HBase</a></p>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/mongodb">MongoDB</a></p>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/jdbc">JDBC</a></p>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery">Google BigQuery</a></p>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable">Google Cloud Bigtable</a></p>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore">Google Cloud Datastore</a></p>
+  </td>
+</tr>
+<tr>
+  <td>Python</td>
+  <td>
+    <p><a href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/avroio.py">avroio</a></p>
+    <p><a href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/textio.py">textio</a></p>
+  </td>
+  <td>
+  </td>
+  <td>
+    <p><a href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py">Google BigQuery</a></p>
+    <p><a href="https://github.com/apache/beam/tree/master/sdks/python/apache_beam/io/gcp/datastore">Google Cloud Datastore</a></p>
+  </td>
+
+</tr>
+</table>
+

http://git-wip-us.apache.org/repos/asf/beam-site/blob/f2171885/src/documentation/io/contributing.md
----------------------------------------------------------------------
diff --git a/src/documentation/io/contributing.md 
b/src/documentation/io/contributing.md
new file mode 100644
index 0000000..949db3c
--- /dev/null
+++ b/src/documentation/io/contributing.md
@@ -0,0 +1,15 @@
+---
+layout: default
+title: "Contributing I/O Transforms"
+permalink: /documentation/io/contributing/
+---
+
+[Pipeline I/O Table of Contents]({{site.baseurl}}/documentation/io/io-toc/)
+
+# Contributing I/O Transforms
+
+* If you are planning to contribute your I/O transform to the Apache Beam 
community, you'll be going through the normal Beam contribution life cycle - 
see the [Apache Beam Contribution Guide]({{ site.baseurl 
}}/contribute/contribution-guide/) for more details.
+* Talk to the community!
+* Make sure you've implemented the appropriate tests as discussed in the 
[Testing I/O Transforms]({{site.baseurl }}/documentation/io/testing/) section.
+
+> Note: This guide is still in progress. There is an open issue to finish the 
guide: [BEAM-1025](https://issues.apache.org/jira/browse/BEAM-1025).

http://git-wip-us.apache.org/repos/asf/beam-site/blob/f2171885/src/documentation/io/io-toc.md
----------------------------------------------------------------------
diff --git a/src/documentation/io/io-toc.md b/src/documentation/io/io-toc.md
new file mode 100644
index 0000000..ec6b244
--- /dev/null
+++ b/src/documentation/io/io-toc.md
@@ -0,0 +1,26 @@
+---
+layout: default
+title: "Pipeline I/O"
+permalink: /documentation/io/io-toc/
+---
+
+# Pipeline I/O
+
+## Using Pipeline I/O
+* [Programming Guide: Using I/O Transforms]({{site.baseurl 
}}/documentation/programming-guide#io)
+* [Built-in I/O Transforms]({{site.baseurl }}/documentation/io/built-in/)
+
+
+## Authoring Read &amp; Write I/O Transforms
+
+> Note: This guide is still in progress. There is an open issue to finish the 
guide: [BEAM-1025](https://issues.apache.org/jira/browse/BEAM-1025).
+
+<!-- TODO: commented out until this content is ready.
+
+This series of articles will walk you through the process of creating a new 
I/O transform. 
+
+* [Authoring I/O Transforms - Overview]({{site.baseurl 
}}/documentation/io/authoring-overview/)
+* [Authoring I/O Transforms - Python]({{site.baseurl 
}}/documentation/io/authoring-python/)
+* [Authoring I/O Transforms - Java]({{site.baseurl 
}}/documentation/io/authoring-java/)
+* [Testing I/O Transforms]({{site.baseurl }}/documentation/io/testing/)
+* [Contributing I/O Transforms]({{site.baseurl 
}}/documentation/io/contributing/) -->

http://git-wip-us.apache.org/repos/asf/beam-site/blob/f2171885/src/documentation/io/testing.md
----------------------------------------------------------------------
diff --git a/src/documentation/io/testing.md b/src/documentation/io/testing.md
new file mode 100644
index 0000000..e43c628
--- /dev/null
+++ b/src/documentation/io/testing.md
@@ -0,0 +1,19 @@
+---
+layout: default
+title: "Testing I/O Transforms"
+permalink: /documentation/io/testing/
+---
+
+[Pipeline I/O Table of Contents]({{site.baseurl}}/documentation/io/io-toc/)
+
+# Testing I/O Transforms
+
+> Note: This guide is still in progress. There is an open issue to finish the 
guide: [BEAM-1025](https://issues.apache.org/jira/browse/BEAM-1025).
+
+
+# Next steps
+
+If you have a well tested I/O transform, why not contribute it to Apache Beam? 
Read all about it:
+
+[Contributing I/O Transforms]({{site.baseurl }}/documentation/io/contributing/)
+

http://git-wip-us.apache.org/repos/asf/beam-site/blob/f2171885/src/documentation/programming-guide.md
----------------------------------------------------------------------
diff --git a/src/documentation/programming-guide.md 
b/src/documentation/programming-guide.md
index 65a3062..57b49e8 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -921,9 +921,8 @@ While `ParDo` always produces a main output `PCollection` 
(as the return value f
 
 ## <a name="io"></a>Pipeline I/O
 
-When you create a pipeline, you often need to read data from some external 
source, such as a file in external data sink or a database. Likewise, you may 
want your pipeline to output its result data to a similar external data sink. 
Beam provides read and write transforms for a number of common data storage 
types. If you want your pipeline to read from or write to a data storage format 
that isn't supported by the built-in transforms, you can implement your own 
read and write transforms.
+When you create a pipeline, you often need to read data from some external 
source, such as a file in external data sink or a database. Likewise, you may 
want your pipeline to output its result data to a similar external data sink. 
Beam provides read and write transforms for a [number of common data storage 
types]({{site.baseurl }}/documentation/io/built-in/). If you want your pipeline 
to read from or write to a data storage format that isn't supported by the 
built-in transforms, you can [implement your own read and write 
transforms]({{site.baseurl }}/documentation/io/io-toc/).
 
-> A guide that covers how to implement your own Beam IO transforms is in 
progress ([BEAM-1025](https://issues.apache.org/jira/browse/BEAM-1025)).
 
 ### Reading input data
 
@@ -988,55 +987,8 @@ records.apply("WriteToText",
 %}
 ```
 
-### Beam-provided I/O APIs
-
-See the language specific source code directories for the Beam supported I/O 
APIs. Specific documentation for each of these I/O sources will be added in the 
future. ([BEAM-1054](https://issues.apache.org/jira/browse/BEAM-1054))
-
-<table class="table table-bordered">
-<tr>
-  <th>Language</th>
-  <th>File-based</th>
-  <th>Messaging</th>
-  <th>Database</th>
-</tr>
-<tr>
-  <td>Java</td>
-  <td>
-    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java">AvroIO</a></p>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/hdfs">HDFS</a></p>
-    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java">TextIO</a></p>
-    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/">XML</a></p>
-  </td>
-  <td>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/jms">JMS</a></p>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/kafka">Kafka</a></p>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/kinesis">Kinesis</a></p>
-    <p><a href="https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io">Google Cloud PubSub</a></p>
-  </td>
-  <td>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/hbase">Apache HBase</a></p>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/mongodb">MongoDB</a></p>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/jdbc">JDBC</a></p>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery">Google BigQuery</a></p>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable">Google Cloud Bigtable</a></p>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore">Google Cloud Datastore</a></p>
-  </td>
-</tr>
-<tr>
-  <td>Python</td>
-  <td>
-    <p><a href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/avroio.py">avroio</a></p>
-    <p><a href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/textio.py">textio</a></p>
-  </td>
-  <td>
-  </td>
-  <td>
-    <p><a href="https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py">Google BigQuery</a></p>
-    <p><a href="https://github.com/apache/beam/tree/master/sdks/python/apache_beam/io/gcp/datastore">Google Cloud Datastore</a></p>
-  </td>
-
-</tr>
-</table>
+### Beam-provided I/O Transforms
+See the  [Beam-provided I/O Transforms]({{site.baseurl 
}}/documentation/io/built-in/) page for a list of the currently available I/O 
transforms.
 
 
 ## <a name="running"></a>Running the pipeline

http://git-wip-us.apache.org/repos/asf/beam-site/blob/f2171885/src/documentation/sdks/java.md
----------------------------------------------------------------------
diff --git a/src/documentation/sdks/java.md b/src/documentation/sdks/java.md
index 1a3d856..474dc93 100644
--- a/src/documentation/sdks/java.md
+++ b/src/documentation/sdks/java.md
@@ -21,22 +21,5 @@ See the [Java API Reference]({{ site.baseurl 
}}/documentation/sdks/javadoc/) for
 The Java SDK supports all features currently supported by the Beam model.
 
 
-## Supported IO Connectors
-
-* Amazon Kinesis
-* Apache Hadoop's `FileInputFormat` in Hadoop Distributed File System (HDFS)
-* Apache HBase
-* Apache Kafka
-* Avro Files
-* Google BigQuery
-* Google Cloud Bigtable
-* Google Cloud Datastore
-* Google Cloud Pub/Sub
-* Google Cloud Storage
-* Java Database Connectivity (JDBC)
-* Java Message Service (JMS)
-* MongoDB
-* Text Files
-* XML Files
-
-
+## Pipeline I/O
+See the [Beam-provided I/O Transforms]({{site.baseurl 
}}/documentation/io/built-in/) page for a list of the currently available I/O 
transforms.
