Re: [DISCUSS] FLIP-120: Support conversion between PyFlink Table and Pandas DataFrame

2020-04-06 Thread Dian Fu
Thank you all for the discussion. It seems that we have reached consensus on the design. I will start a VOTE thread if there is no further feedback.

Regards,
Dian

> On April 3, 2020, at 2:58 PM, Wei Zhong wrote:
>
> Hi Dian,
>
> Thanks for driving this. Big +1 for supporting from/to pandas in PyFlink!

Re: [DISCUSS] FLIP-120: Support conversion between PyFlink Table and Pandas DataFrame

2020-04-03 Thread Wei Zhong
Hi Dian,

Thanks for driving this. Big +1 for supporting from/to pandas in PyFlink!

Best,
Wei

> On April 3, 2020, at 13:46, jincheng sun wrote:
>
> +1, thanks for bringing up this discussion @Dian Fu
>
> Best,
> Jincheng
>
> On Wed, April 1, 2020 at 1:27 PM, Jeff Zhang wrote:
>
>> Thanks for the reply, Dian, that makes

Re: [DISCUSS] FLIP-120: Support conversion between PyFlink Table and Pandas DataFrame

2020-04-02 Thread jincheng sun
+1, thanks for bringing up this discussion @Dian Fu

Best,
Jincheng

On Wed, April 1, 2020 at 1:27 PM, Jeff Zhang wrote:

> Thanks for the reply, Dian, that makes sense to me.
>
> On Wed, April 1, 2020 at 11:53 AM, Dian Fu wrote:
>
> > Hi Jeff,
> >
> > Thanks for your feedback.
> >
> > ArrowTableSink is a Flink sink which is

Re: [DISCUSS] FLIP-120: Support conversion between PyFlink Table and Pandas DataFrame

2020-03-31 Thread Jeff Zhang
Thanks for the reply, Dian, that makes sense to me.

On Wed, April 1, 2020 at 11:53 AM, Dian Fu wrote:

> Hi Jeff,
>
> Thanks for your feedback.
>
> ArrowTableSink is a Flink sink which is responsible for collecting the data of the table. It will serialize the data of the table to Arrow format to make sure that

Re: [DISCUSS] FLIP-120: Support conversion between PyFlink Table and Pandas DataFrame

2020-03-31 Thread Dian Fu
Hi Jeff,

Thanks for your feedback.

ArrowTableSink is a Flink sink which is responsible for collecting the data of the table. It will serialize the data of the table to Arrow format so that it can be deserialized into a pandas DataFrame efficiently. You are right that pandas dataframe

Re: [DISCUSS] FLIP-120: Support conversion between PyFlink Table and Pandas DataFrame

2020-03-31 Thread Jeff Zhang
Thanks Dian for driving this, definitely +1.

Here are my 2 cents:

1. I would pay more attention to to_pandas than to from_pandas, because I believe to_pandas will be used more frequently.
2. I think ArrowTableSink may not be enough for to_pandas, because the pandas DataFrame is on the client side, it is not a

[DISCUSS] FLIP-120: Support conversion between PyFlink Table and Pandas DataFrame

2020-03-31 Thread Dian Fu
Hi everyone,

I'd like to start a discussion about supporting conversion between PyFlink Table and Pandas DataFrame. The pandas DataFrame is the de facto standard for working with tabular data in the Python community. The PyFlink Table is Flink's representation of tabular data in Python. It would