Hi everyone,

I would like to discuss a particular aspect of our documentation: the
top-level structure with respect to languages and APIs. The current
structure is inconsistent and the direction is unclear to me, which makes
it hard for me to contribute gradual improvements.

Currently, the Python documentation has its own independent branch in the
documentation [1]. In the rest of the documentation, Python is sometimes
included, as on this Table API page [2], and sometimes ignored, as on the
project setup pages [3]. Scala and Java, on the other hand, are always
documented in parallel, next to each other in tabs.

The way I see it, most parts (application development, connectors, getting
started, project setup) of our documentation have two primary dimensions:
API (DataStream, Table API) and Language (Python, Java, Scala).

In addition, there is SQL, for which the language is only a minor factor
(UDFs), but which generally requires a different structure (different
audience, different tools). SQL and Table API do have some conceptual
overlap, but I doubt these concepts are of much interest to SQL users. So,
to me, SQL should be treated separately in any case, with links to the
Table API documentation for some concepts.

I think, in general, both approaches can work:


*Option 1: "Language Tabs"*
Application Development
> DataStream API  (Java, Scala, Python)
> Table API (Java, Scala, Python)
> SQL


*Option 2: "Language First"*
Java Development Guide
> Getting Started
> DataStream API
> Table API
Python Development Guide
> Getting Started
> DataStream API
> Table API
SQL Development Guide

I don't have a strong opinion on this, but tend towards "Language First".

* First, I assume users decide on their language/tools of choice first and
only then move on to the API.

* Second, most of the Flink documentation currently uses a "Language Tabs"
approach, but this might become obsolete in the long term anyway as we
move more and more in a Scala-free direction.

For the connectors, I think, there is a good argument for "Language & API
Embedded", because documenting every connector for each API and language
separately would result in a lot of duplication. Here, I would go one step
further than what we have right now and target

Connectors
-> Kafka (All APIs incl. SQL, All Languages)
-> Kinesis (same)
-> ...

This also results in a quick overview for users about which connectors
exist and plays well with our plan of externalizing connectors.

For completeness & scope of the discussion: there are two FLIPs on
documentation (42, 60), neither of which has been implemented. They
partially contradict each other and are generally out of date. I
specifically don't intend to add another FLIP to this graveyard, but I
would still like to reach a consensus on the high-level direction.

What do you think?

Cheers,

Konstantin

-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk
