+1, it's great to have Pandas support in Spark out of the box.

On Tue, Mar 16, 2021 at 10:12 PM Takeshi Yamamuro <linguin....@gmail.com> wrote:
> +1; the pandas interfaces are pretty popular and supporting them in
> pyspark looks promising, I think.
> one question I have; what's an initial goal of the proposal?
> Is that to port all the pandas interfaces that Koalas has already
> implemented?
> Or, the basic set of them?
>
> On Tue, Mar 16, 2021 at 1:44 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>
>> +1
>>
>> Bringing a Pandas API for pyspark to upstream Spark will only bring
>> benefits for everyone (more eyes to use/see/fix/improve the API) as
>> well as better alignment with core Spark improvements, the extra
>> weight looks manageable.
>>
>> On Mon, Mar 15, 2021 at 4:45 PM Nicholas Chammas
>> <nicholas.cham...@gmail.com> wrote:
>> >
>> > On Mon, Mar 15, 2021 at 2:12 AM Reynold Xin <r...@databricks.com> wrote:
>> >>
>> >> I don't think we should deprecate existing APIs.
>> >
>> > +1
>> >
>> > I strongly prefer Spark's immutable DataFrame API to the Pandas API. I
>> > could be wrong, but I wager most people who have worked with both Spark
>> > and Pandas feel the same way.
>> >
>> > For the large community of current PySpark users, or users switching to
>> > PySpark from another Spark language API, it doesn't make sense to
>> > deprecate the current API, even by convention.
>
> --
> ---
> Takeshi Yamamuro