Hi all!

I think the opinions and ideas raised so far are not actually in conflict,
so let me summarize the proposal as I understand it:

*(1) Long-term goal: Full Python Table API with UDFs*

     To break the implementation effort into stages, the first step would
be the API without UDFs.
     Thanks to the many built-in functions in the Table API, that step
already has some value on its own, but it is ultimately quite limited
without UDF support.

     ==> The FLIP should probably reflect the full goal rather than only
the first implementation step, so that everyone understands what the final
goal of the effort is.


*(2) Relationship to Beam Language Portability*

Flink's own Python Table API and Beam-Python on Flink add different value
and are both attractive for different scenarios.

  - Beam's Python API supports complex pipelines in a style similar to the
DataStream API. There is also an ecosystem of libraries built on top of that
DSL, for example for machine learning.

  - Flink's Python Table API builds mostly relational expressions, plus
some UDFs. Most of the Python code never executes in Python, though. It is
geared at use cases similar to Flink's Table API.

The two approaches differ mainly in how the streaming DAG is built from
Python code and received by the JVM.
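
To illustrate the point that most Table API code written in Python never
executes in Python, here is a minimal toy sketch. It is not Flink's API: the
Table class, its methods, and the JSON plan format are all hypothetical and
only show the idea of Python calls recording a declarative plan that another
process (for example the JVM) would receive and execute.

```python
import json


class Table:
    """Toy declarative table handle (hypothetical, NOT the Flink API)."""

    def __init__(self, plan):
        # The plan is just a list of recorded operations; no data is touched.
        self._plan = plan

    @staticmethod
    def scan(name):
        # Start a plan that reads from a named source table.
        return Table([{"op": "scan", "table": name}])

    def filter(self, predicate):
        # Record the predicate as a string; it is never evaluated in Python.
        return Table(self._plan + [{"op": "filter", "predicate": predicate}])

    def select(self, *fields):
        # Record the projection; again nothing is computed here.
        return Table(self._plan + [{"op": "select", "fields": list(fields)}])

    def to_plan_json(self):
        # Serialize the recorded plan so another process could execute it.
        return json.dumps(self._plan)


# Building the query only records steps; execution would happen elsewhere.
plan_json = (
    Table.scan("orders")
    .filter("amount > 100")
    .select("user", "amount")
    .to_plan_json()
)
print(plan_json)
```

In such a design, only UDF bodies would need a Python process at runtime;
everything else is a plan description handed over to the engine.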

In previous discussions, we concluded that for inter-process data exchange
(JVM <> Python), we want to share code with Beam.
That part is possibly the most crucial piece for getting performance out of
the Python DSL, so it will benefit from shared development, optimizations,
etc.

Best,
Stephan




On Fri, Apr 5, 2019 at 5:25 PM jincheng sun <sunjincheng...@gmail.com>
wrote:

> One more thing: it is worth mentioning that the Flink Table API is a
> superset of Flink SQL. For example:
> - AddColumns/DropColumns/RenameColumns; the details can be found in the
> Google doc
> <
> https://docs.google.com/document/d/1tryl6swt1K1pw7yvv5pdvFXSxfrBZ3_OkOObymis2ck/edit#heading=h.7rwcjbvr52dc
> >
> - Interactive Programming in Flink Table API; the details can be found in
> FLIP-36
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-36%3A+Support+Interactive+Programming+in+Flink
> >
> I think that in the future, more and more features that cannot be
> expressed in SQL will be added to the Table API.
>
> Thomas Weise <thomas.we...@gmail.com> wrote on Fri, Apr 5, 2019 at 12:11 PM:
>
> > Hi Jincheng,
> >
> > >
> > > Yes, we can add use case examples to both the Google doc and the FLIP. I
> > > have already added a simple usage example to the Google doc. Which kind
> > > of examples would you like to see? :)
> > >
> >
> > Do you have use cases where the Python Table API can be applied without
> > UDF support?
> >
> > (And where the same could not be accomplished with just SQL.)
> >
> >
> > > The very short answer to UDF support is yes. As you said, we need UDF
> > > support in the Python Table API, including UDF, UDTF, and UDAF. This
> > > needs to be discussed after the basic Python Table API is supported.
> > > UDF support involves the management of the Python environment and
> > > Runtime level Java and Runtime communication, and UDAF in Flink also
> > > involves the use of State, so this is a topic that is worth discussing
> > > in depth in a separate thread.
> > >
> >
> > The current proposal for job submission touches something that Beam
> > portability already had to solve.
> >
> > If we think that the Python Table API will only be useful with UDF
> > support (question above), then it may be better to discuss the first step
> > with the final goal in mind. If we find that Beam can be used for the UDF
> > part, then approach 1 vs. approach 2 in the doc (for the client side
> > language boundary) may look different.
> >
> >
> > >
> > > I think that no matter how Flink and Beam work together on the UDF
> > > level, it will not affect the current Python API (interface). We can
> > > first support the Python API in Flink, then start the UDX
> > > (UDF/UDTF/UDAF) support.
> > >
> > >
> > I agree that the client side API should not be affected.
> >
> >
> > > And many thanks for your valuable comments in the Google doc! I will
> > > reply to you in the Google doc. :)
> > >
> > >
> > > Regards,
> > > Jincheng
> > >
> > > Thomas Weise <t...@apache.org> wrote on Thu, Apr 4, 2019 at 8:03 AM:
> > >
> > > > Thanks for putting this proposal together.
> > > >
> > > > It would be nice if you could share a few use case examples (maybe
> > > > add them as a section to the FLIP?).
> > > >
> > > > The reason I ask: The Table API is immensely useful, but it isn't
> > > > clear to me what value other language bindings provide without UDF
> > > > support. With FLIP-38 it will be possible to write a program in
> > > > Python, but not execute Python functions. Without UDF support, isn't
> > > > it possible to achieve roughly the same with plain SQL? In which
> > > > situation would I use the Python API?
> > > >
> > > > There was related discussion regarding UDF support in [1]. If the
> > > > assumption is that such support will be added later, then I would
> > > > like to circle back to the question of why this cannot be built on
> > > > top of Beam. It would be nice to clarify the bigger goal before
> > > > embarking on the first milestone.
> > > >
> > > > I'm going to comment on other things in the doc.
> > > >
> > > > [1]
> > > > https://lists.apache.org/thread.html/f6f8116b4b38b0b2d70ed45b990d6bb1bcb33611fde6fdf32ec0e840@%3Cdev.flink.apache.org%3E
> > > >
> > > > Thomas
> > > >
> > > >
> > > > On Wed, Apr 3, 2019 at 12:35 PM Shuyi Chen <suez1...@gmail.com> wrote:
> > > >
> > > > > Thanks a lot for driving the FLIP, Jincheng. The approach looks
> > > > > good. Adding multi-language support sounds like a promising
> > > > > direction for expanding the footprint of Flink. Do we have a plan
> > > > > for adding Golang support? Many backend engineers nowadays are
> > > > > familiar with Go, but probably not as much with Java, so adding
> > > > > Golang support would significantly reduce their friction in using
> > > > > Flink. Also, do we have a design for multi-language UDF support,
> > > > > and what is the timeline for adding DataStream API support? We
> > > > > would like to help and contribute as well, as we have a similar
> > > > > need internally at our company.
> > > > > Thanks a lot.
> > > > >
> > > > > Shuyi
> > > > >
> > > > > On Tue, Apr 2, 2019 at 1:03 AM jincheng sun <sunjincheng...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > > As Xianda brought up in the previous email, there are a large
> > > > > > number of data analysis users who want Flink to support Python.
> > > > > > At the Flink API level, we have
> > > > > > DataStreamAPI/DataSetAPI/TableAPI&SQL, and the Table API will
> > > > > > become the first-class citizen. The Table API is declarative and
> > > > > > can be automatically optimized, as mentioned in the Flink
> > > > > > mid-term roadmap by Stephan. So we are first considering
> > > > > > supporting Python at the Table level, to cater to the current
> > > > > > large number of analytics users. To further promote Python
> > > > > > support at the Flink Table level, Dian, Wei and I discussed
> > > > > > offline a bit and came up with an initial feature outline as
> > > > > > follows:
> > > > > >
> > > > > > - Python TableAPI Interface
> > > > > >   Introduce a set of Python Table API interfaces, including
> > > > > >   interface definitions such as Table, TableEnvironment,
> > > > > >   TableConfig, etc.
> > > > > >
> > > > > > - Implementation Architecture
> > > > > >   We will offer two alternative architecture options, one for
> > > > > >   pure Python language support and one for an extended
> > > > > >   multi-language design.
> > > > > >
> > > > > > - Job Submission
> > > > > >   Provide a way to submit (local/remote) Python Table API jobs.
> > > > > >
> > > > > > - Python Shell
> > > > > >   The Python Shell provides an interactive way for users to write
> > > > > >   and execute Flink Python Table API jobs.
> > > > > >
> > > > > >
> > > > > > The design document for FLIP-38 can be found here:
> > > > > >
> > > > > > https://docs.google.com/document/d/1ybYt-0xWRMa1Yf5VsuqGRtOfJBz4p74ZmDxZYg3j_h8/edit?usp=sharing
> > > > > >
> > > > > > I am looking forward to your comments and feedback.
> > > > > >
> > > > > > Best,
> > > > > > Jincheng
> > > > > >
> > > > >
> > > >
> > >
> >
>
