[FLINK-6660] [docs] Expand the connectors overview page

This closes #3964.
Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/ce685dbd
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/ce685dbd
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/ce685dbd

Branch: refs/heads/release-1.3
Commit: ce685dbdae011b6220934836339b0a0130929ba4
Parents: 0ae98d3
Author: David Anderson <[email protected]>
Authored: Mon May 22 17:39:34 2017 +0200
Committer: Tzu-Li (Gordon) Tai <[email protected]>
Committed: Tue May 23 22:27:10 2017 +0800

----------------------------------------------------------------------
 docs/dev/connectors/filesystem_sink.md |  2 +-
 docs/dev/connectors/index.md           | 56 +++++++++++++++++++++--------
 docs/dev/connectors/twitter.md         |  2 +-
 3 files changed, 44 insertions(+), 16 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/flink/blob/ce685dbd/docs/dev/connectors/filesystem_sink.md
----------------------------------------------------------------------
diff --git a/docs/dev/connectors/filesystem_sink.md b/docs/dev/connectors/filesystem_sink.md
index d12752f..2d48876 100644
--- a/docs/dev/connectors/filesystem_sink.md
+++ b/docs/dev/connectors/filesystem_sink.md
@@ -24,7 +24,7 @@ under the License.
 -->
 This connector provides a Sink that writes partitioned files to any filesystem supported by
-Hadoop FileSystem. To use this connector, add the
+[Hadoop FileSystem](http://hadoop.apache.org). To use this connector, add the
 following dependency to your project:
 {% highlight xml %}

http://git-wip-us.apache.org/repos/asf/flink/blob/ce685dbd/docs/dev/connectors/index.md
----------------------------------------------------------------------
diff --git a/docs/dev/connectors/index.md b/docs/dev/connectors/index.md
index f5c3eec..ff76aee 100644
--- a/docs/dev/connectors/index.md
+++ b/docs/dev/connectors/index.md
@@ -25,22 +25,50 @@ specific language governing permissions and limitations
 under the License.
 -->
-Connectors provide code for interfacing with various third-party systems.
+* toc
+{:toc}
-Currently these systems are supported: (Please select the respective documentation page from the navigation on the left.)
+## Predefined Sources and Sinks
- * [Apache Kafka](https://kafka.apache.org/) (sink/source)
- * [Elasticsearch](https://elastic.co/) (sink)
- * [Hadoop FileSystem](http://hadoop.apache.org) (sink)
- * [RabbitMQ](http://www.rabbitmq.com/) (sink/source)
- * [Amazon Kinesis Streams](http://aws.amazon.com/kinesis/streams/) (sink/source)
- * [Twitter Streaming API](https://dev.twitter.com/docs/streaming-apis) (source)
- * [Apache NiFi](https://nifi.apache.org) (sink/source)
- * [Apache Cassandra](https://cassandra.apache.org/) (sink)
+A few basic data sources and sinks are built into Flink and are always available.
+The [predefined data sources]({{ site.baseurl }}/dev/datastream_api.html#data-sources) include reading from files, directories, and sockets, and
+ingesting data from collections and iterators.
+The [predefined data sinks]({{ site.baseurl }}/dev/datastream_api.html#data-sinks) support writing to files, to stdout and stderr, and to sockets.
+## Bundled Connectors
+Connectors provide code for interfacing with various third-party systems. Currently these systems are supported:
-To run an application using one of these connectors, additional third party
-components are usually required to be installed and launched, e.g. the servers
-for the message queues. Further instructions for these can be found in the
-corresponding subsections.
+ * [Apache Kafka](kafka.html) (sink/source)
+ * [Apache Cassandra](cassandra.html) (sink)
+ * [Amazon Kinesis Streams](kinesis.html) (sink/source)
+ * [Elasticsearch](elasticsearch.html) (sink)
+ * [Hadoop FileSystem](filesystem_sink.html) (sink)
+ * [RabbitMQ](rabbitmq.html) (sink/source)
+ * [Apache NiFi](nifi.html) (sink/source)
+ * [Twitter Streaming API](twitter.html) (source)
+Keep in mind that to use one of these connectors in an application, additional third party
+components are usually required, e.g. servers for the data stores or message queues.
+Note also that while the streaming connectors listed in this section are part of the
+Flink project and are included in source releases, they are not included in the binary distributions.
+Further instructions can be found in the corresponding subsections.
+## Other Ways to Connect to Flink
+### Data Enrichment via Async I/O
+Using a connector isn't the only way to get data in and out of Flink.
+One common pattern is to query an external database or web service in a `Map` or `FlatMap`
+in order to enrich the primary datastream.
+Flink offers an API for [Asynchronous I/O]({{ site.baseurl }}/dev/stream/asyncio.html)
+to make it easier to do this kind of enrichment efficiently and robustly.
+### Queryable State
+When a Flink application pushes a lot of data to an external data store, this
+can become an I/O bottleneck.
+If the data involved has many fewer reads than writes, a better approach can be
+for an external application to pull from Flink the data it needs.
+The [Queryable State]({{ site.baseurl }}/dev/stream/queryable_state.html) interface
+enables this by allowing the state being managed by Flink to be queried on demand.

http://git-wip-us.apache.org/repos/asf/flink/blob/ce685dbd/docs/dev/connectors/twitter.md
----------------------------------------------------------------------
diff --git a/docs/dev/connectors/twitter.md b/docs/dev/connectors/twitter.md
index 5fb7d68..0cded6a 100644
--- a/docs/dev/connectors/twitter.md
+++ b/docs/dev/connectors/twitter.md
@@ -23,7 +23,7 @@ specific language governing permissions and limitations
 under the License.
 -->
-The Twitter Streaming API provides access to the stream of tweets made available by Twitter.
+The [Twitter Streaming API](https://dev.twitter.com/docs/streaming-apis) provides access to the stream of tweets made available by Twitter.
 Flink Streaming comes with a built-in `TwitterSource` class for establishing a connection to this stream.
 To use this connector, add the following dependency to your project:
