[GitHub] [flink-web] MarkSfik commented on a change in pull request #361: Catalogs blogpost

GitBox Mon, 20 Jul 2020 08:42:03 -0700


MarkSfik commented on a change in pull request #361:
URL: https://github.com/apache/flink-web/pull/361#discussion_r457505908




##########
File path: _posts/2020-07-21-catalogs.md
##########
@@ -0,0 +1,178 @@
+---
+layout: post
+title: "Sharing is caring - Catalogs in Flink SQL"
+date: 2020-07-21T08:00:00.000Z
+categories: news
+authors:
+- dawid:
+  name: "Dawid Wysakowicz"
+  twitter: "dwysakowicz"
+---
+
+It's not a surprise that, in an era of digitalization, data is the most 
valuable asset in many companies: it's always the base for and product of any 
analysis or business logic. With an ever growing number of people working with 
data, it's a common practice for companies to build self-service platforms with 
the goal of democratising its access across different teams and — especially — 
to enable users from any background to be independent in their data needs. In 
such environments, metadata management becomes a crucial aspect. Without it, 
users often work blindly, spending too much time searching for datasets and 
their location, figuring out data formats and similar cumbersome tasks.
+
+It is a common practice for companies to start building a data platform with a 
metastore, catalog, or schema registries of some sort in place. Those let you 
clearly separate making the data available from consuming it. That separation 
has a few benefits:
+* improved productivity - The most obvious one. Making data reusable and 
shifting the focus on building new models/pipelines rather than data cleansing 
and discovery.
+* security - You can control the access to certain features of the data. For 
example, you can make the schema of dataset publicly available, but limit the 
actual access to the underlying data to only particular teams.

Review comment:
```suggestion
* security - You can control the access to certain features of the data. For 
example, you can make the schema of the dataset publicly available, but limit 
the actual access to the underlying data only to particular teams.
```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
