Hi Dian,

Thank you for sharing your thoughts. What do you propose going forward? I
am not sure I got this from your email.

Best,

Konstantin



On Wed, Mar 23, 2022 at 10:03 AM Dian Fu <dian0511...@gmail.com> wrote:

> Hi Konstantin,
>
> Thanks a lot for bringing up this discussion.
>
> Currently, the Python documentation is more like a mixture of Option 1 and
> Option 2. It contains two parts:
> 1) The first part is the independent page [1] which could be seen as the
> main entrypoint for Python users.
> 2) The second part is the Python tabs which are among the DataStream API /
> Table API pages.
>
> The motivation to provide an independent page for Python documentation is
> as follows:
> 1) We are trying to create Pythonic documentation for Python users (we
> are still far away from that, and I have received a lot of feedback saying
> that the Python documentation and API are too Java-like). However, to avoid
> duplication, it links to the DataStream API / Table API pages when
> necessary instead of copying content. There are indeed exceptions, e.g. the
> window example given by Jark. That is because the Python DataStream API
> only provides very limited window support at present, so we added a
> dedicated page to give Python users a complete picture of what they can do
> in the Python DataStream API. We are trying to finalize the window
> support in 1.16 [2] and remove the duplicate documentation.
> 2) Some kinds of documentation are only applicable to the Python
> language, e.g. dependency management [3], conversion between Table
> and Pandas DataFrame [4], etc. An independent page provides a place
> to hold all these kinds of documentation together.
>
> Regarding Option 1: "Language Tabs", this makes it hard to create Pythonic
> documentation for Python users.
> Regarding Option 2: "Language First", it may mean a lot of duplication.
> Currently, there are a lot of descriptions in the DataStream API / Table
> API pages which are shared between Java/Scala/Python.
>
> > In the rest of the documentation, Python is sometimes
> > included like in this Table API page [2] and sometimes ignored like on
> > the project setup pages [3].
> I agree that this is something that we need to improve.
>
> Regards,
> Dian
>
> [1]
>
> https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/python/overview/
> [2] https://issues.apache.org/jira/browse/FLINK-26477
> [3]
>
> https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/python/dependency_management/
> [4]
>
> https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/python/table/conversion_of_pandas/
>
> On Wed, Mar 23, 2022 at 4:17 PM Jark Wu <imj...@gmail.com> wrote:
>
> > Hi Konstantin,
> >
> > Thanks for starting this discussion.
> >
> > From my perspective, I prefer the "Language Tabs" approach.
> > But maybe we can improve the tabs by moving them to the sidebar or top menu,
> > which allows users to first decide on their language and then the API.
> > IMO, programming languages are just like spoken languages, which can be
> > picked in the sidebar.
> > What I want to avoid is duplicate docs and incomplete features in a
> > specific language.
> > "Language First" may confuse users about what the complete set of features
> > provided by Flink is and where to find it.
> >
> > For example, there are a lot of duplications in the "Window" pages[1] and
> > "Python Window" pages[2].
> > And users can't have a complete overview of Flink's window mechanism from
> > the Python API part.
> > Users have to go through the Java/Scala DataStream API first to build the
> > overall knowledge, and then read the Python API part.
> >
> > > * Second, most of the Flink Documentation currently is using a
> > > "Language Tabs" approach, but this might become obsolete in the long-term
> > > anyway as we move more and more in a Scala-free direction.
> >
> > The Scala-free direction means users can pick arbitrary Scala versions,
> > not that we drop the Scala API.
> > So "Language Tabs" are still necessary and helpful for switching
> > languages.
> >
> > Best,
> > Jark
> >
> > [1]:
> >
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/
> > [2]:
> >
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/python/datastream/operators/windows/
> >
> >
> >
> >
> >
> >
> >
> > On Tue, 22 Mar 2022 at 21:40, Konstantin Knauf <kna...@apache.org> wrote:
> >
> > > Hi everyone,
> > >
> > > I would like to discuss a particular aspect of our documentation: the
> > > top-level structure with respect to languages and APIs. The current
> > > structure is inconsistent and the direction is unclear to me, which
> > > makes it hard for me to contribute gradual improvements.
> > >
> > > Currently, the Python documentation has its own independent branch in
> > > the documentation [1]. In the rest of the documentation, Python is
> > > sometimes included like in this Table API page [2] and sometimes ignored
> > > like on the project setup pages [3]. Scala and Java on the other hand are
> > > always documented in parallel next to each other in tabs.
> > >
> > > The way I see it, most parts (application development, connectors,
> > > getting started, project setup) of our documentation have two primary
> > > dimensions: API (DataStream, Table API) and Language (Python, Java, Scala).
> > >
> > > In addition, there is SQL, for which the language is only a minor
> > > factor (UDFs), but which generally requires a different structure
> > > (different audience, different tools). On the other hand, SQL and Table
> > > API have some conceptual overlap, whereas I doubt these concepts are of
> > > big interest to SQL users. So, to me SQL should be treated separately in
> > > any case, with links to the Table API documentation for some concepts.
> > >
> > > I think, in general, both approaches can work:
> > >
> > >
> > > *Option 1: "Language Tabs"*
> > > Application Development
> > > > DataStream API  (Java, Scala, Python)
> > > > Table API (Java, Scala, Python)
> > > > SQL
> > >
> > >
> > > *Option 2: "Language First"*
> > > Java Development Guide
> > > > Getting Started
> > > > DataStream API
> > > > Table API
> > > Python Development Guide
> > > > Getting Started
> > > > > DataStream API
> > > > Table API
> > > SQL Development Guide
> > >
> > > I don't have a strong opinion on this, but tend towards "Language
> > > First".
> > >
> > > * First, I assume, users actually first decide on their language/tools
> > > of choice and then move on to the API.
> > >
> > > * Second, most of the Flink Documentation currently is using a
> > > "Language Tabs" approach, but this might become obsolete in the long-term
> > > anyway as we move more and more in a Scala-free direction.
> > >
> > > For the connectors, I think, there is a good argument for "Language &
> > > API Embedded", because documenting every connector for each API and
> > > language separately would result in a lot of duplication. Here, I would
> > > go one step further than what we have right now and target
> > >
> > > Connectors
> > > -> Kafka (All APIs incl. SQL, All Languages)
> > > -> Kinesis (same)
> > > -> ...
> > >
> > > This also results in a quick overview for users about which connectors
> > > exist and plays well with our plan of externalizing connectors.
> > >
> > > For completeness & scope of the discussion: there are two outdated
> > > FLIPs on documentation (42, 60), neither of which has been implemented;
> > > they partially contradict each other and are generally out-of-date. I
> > > specifically don't intend to add another FLIP to this graveyard, but
> > > still want to reach a consensus on the high-level direction.
> > >
> > > What do you think?
> > >
> > > Cheers,
> > >
> > > Konstantin
> > >
> > > --
> > >
> > > Konstantin Knauf
> > >
> > > https://twitter.com/snntrable
> > >
> > > https://github.com/knaufk
> > >
> >
>


-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk
