morsapaes commented on a change in pull request #361:
URL: https://github.com/apache/flink-web/pull/361#discussion_r458595186
##########
File path: _posts/2020-07-21-catalogs.md
##########
@@ -35,7 +35,12 @@ Catalogs don’t have to be limited to the metadata of
datasets. You can usually
* **Queries** - Those can be useful when you don’t want to persist a data set,
but want to provide a recipe for creating it from other sources instead.
## Catalogs support in Flink SQL
-Starting from version 1.9, Flink has a set of Catalog APIs that allows to
integrate Flink with various catalog implementations. With the help of those
APIs, you can query tables in Flink that were created in your external catalogs
(e.g. Hive Metastore). Additionally, depending on the catalog implementation,
you can create new objects such as tables or views from Flink, reuse them
across different jobs, and possibly even use them in other tools compatible
with that catalog. As of Flink 1.11, there are two catalog implementations
supported by the community:
+Starting from version 1.9, Flink has a set of Catalog APIs that allows to
integrate Flink with various catalog implementations. With the help of those
APIs, you can query tables in Flink that were created in your external catalogs
(e.g. Hive Metastore). Additionally, depending on the catalog implementation,
you can create new objects such as tables or views from Flink, reuse them
across different jobs, and possibly even use them in other tools compatible
with that catalog. In other words you can see catalogs with two-fold purpose:
+
+ * Catalogs are sort of out-of-the box integration with an ecosystem such as
RDBMs or Hive, where you can query the external, towards Flink, tables, views,
or functions without additional connector configuration. The connector
properties are automatically derived from the Catalog itself.
+ * A persistent store for Flink specific metadata. In this mode we
additionally store connector properties alongside the logical metadata such as
a schema or a name. That approach let's you store a full definition of e.g. a
Kafka backed table with records serialized with Avro in Hive that can be later
on used by Flink. However, as it incorporates Flink specific properties it can
not be used by other tools that leverage Hive metastore.
Review comment:
```suggestion
Starting from version 1.9, Flink has a set of Catalog APIs that allow you to
integrate Flink with various catalog implementations. With the help of those
APIs, you can query tables in Flink that were created in your external catalogs
(e.g. Hive Metastore). Additionally, depending on the catalog implementation,
you can create new objects such as tables or views from Flink, reuse them
across different jobs, and possibly even use them in other tools compatible
with that catalog. In other words, you can see catalogs as having a two-fold
purpose:

 * Provide an out-of-the-box integration with ecosystems such as RDBMSs or
Hive that allows you to query external objects like tables, views, or functions
with no additional connector configuration. The connector properties are
automatically derived from the catalog itself.
 * Act as a persistent store for Flink-specific metadata. In this mode, we
additionally store connector properties alongside the logical metadata (e.g.
schema, object name). That approach enables you to, for example, store the full
definition of a Kafka-backed table with records serialized in Avro in Hive, and
later reuse it from Flink. However, as it incorporates Flink-specific
properties, it cannot be used by other tools that leverage the Hive Metastore.
```
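
To make the second mode concrete, here is a minimal, hypothetical sketch using the Java Table API (Flink 1.11): it registers a `HiveCatalog` and persists the definition of a Kafka-backed, Avro-serialized table in the Hive Metastore. The catalog name, database, Hive conf directory, topic, bootstrap servers, and schema are placeholder values, and the snippet assumes the Hive, Kafka, and Avro connector/format dependencies are on the classpath.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class CatalogExample {

    public static void main(String[] args) {
        TableEnvironment tableEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // Register a HiveCatalog backed by an existing Hive Metastore.
        // "myhive", "default" and the conf dir are placeholder values.
        HiveCatalog catalog = new HiveCatalog("myhive", "default", "/opt/hive-conf");
        tableEnv.registerCatalog("myhive", catalog);
        tableEnv.useCatalog("myhive");

        // Persist a Flink-specific table definition (Kafka connector, Avro format)
        // in the Hive Metastore, so that other Flink jobs can reuse it without
        // repeating the connector configuration.
        tableEnv.executeSql(
                "CREATE TABLE orders (\n" +
                "  order_id BIGINT,\n" +
                "  amount   DOUBLE\n" +
                ") WITH (\n" +
                "  'connector' = 'kafka',\n" +
                "  'topic' = 'orders',\n" +
                "  'properties.bootstrap.servers' = 'localhost:9092',\n" +
                "  'format' = 'avro'\n" +
                ")");
    }
}
```

Any other Flink job that registers the same catalog can then query `orders` directly, but because the stored properties are Flink-specific, Hive itself cannot read the table.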
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]