Dian Fu created FLINK-17146:
-------------------------------
Summary: Support conversion between PyFlink Table and Pandas DataFrame
Key: FLINK-17146
URL: https://issues.apache.org/jira/browse/FLINK-17146
Project: Flink
Issue Type: New Feature
Components: API / Python
Reporter: Dian Fu
Assignee: Dian Fu
The Pandas DataFrame is the de facto standard for working with tabular data in
the Python community, and the PyFlink Table is Flink’s representation of tabular
data in Python. It would be useful for the PyFlink Table API to support
conversion between PyFlink Tables and Pandas DataFrames. This has the following
benefits:
* Users can switch seamlessly between PyFlink and Pandas when processing data
in Python, running each step on whichever execution engine suits it. For
example, users who already have a Pandas DataFrame at hand and want to perform
an expensive transformation on it could convert it to a PyFlink Table and
leverage the Flink engine. Conversely, users could convert a PyFlink Table to
a Pandas DataFrame and transform it with the rich functionality of the Pandas
ecosystem.
* No intermediate connectors are needed when converting between them.
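As a rough illustration of the round trip described above, the following is a minimal sketch rather than the design itself. It assumes a PyFlink release in which the conversion is exposed as `TableEnvironment.from_pandas` and `Table.to_pandas` (the names proposed in FLIP-120), and that `pandas` and `apache-flink` are installed. The helper name `round_trip_through_flink` and its parameters are hypothetical and exist only for this example.

```python
def round_trip_through_flink(pdf, column, threshold):
    """Convert a Pandas DataFrame to a PyFlink Table, filter it, and convert back.

    Assumption: the installed PyFlink provides `from_pandas` / `to_pandas`
    as proposed in FLIP-120. This helper is illustrative only.
    """
    # Imported lazily so this module can be loaded without PyFlink installed.
    from pyflink.table import EnvironmentSettings, TableEnvironment
    from pyflink.table.expressions import col

    t_env = TableEnvironment.create(EnvironmentSettings.in_batch_mode())

    # Pandas -> PyFlink: no intermediate connector or file is involved.
    table = t_env.from_pandas(pdf)

    # Run the transformation on the Flink engine.
    filtered = table.filter(col(column) > threshold)

    # PyFlink -> Pandas: collect the result back into a DataFrame.
    return filtered.to_pandas()
```

Called with a small DataFrame, a column name, and a numeric threshold, this should hand the filtering to Flink and return the matching rows as a new DataFrame. The reverse workflow (starting from a PyFlink Table and finishing in Pandas) would use only the `to_pandas` half.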
More details can be found in
[FLIP-120|https://cwiki.apache.org/confluence/display/FLINK/FLIP-120%3A+Support+conversion+between+PyFlink+Table+and+Pandas+DataFrame].