Hi Yan,

The reason I was looking for a categorical datatype is the one given here <https://pandas.pydata.org/pandas-docs/stable/categorical.html>: the ability to restrict a column's values to a specific set of categories. Is it possible to create a user-defined data type in Spark that does this?

Thanks and Regards,
Saatvik Shah

On Fri, Jun 16, 2017 at 1:42 AM, 颜发才(Yan Facai) <facai....@gmail.com> wrote:

> You can use some Transformers to handle categorical data,
> For example,
> StringIndexer encodes a string column of labels to a column of label
> indices:
> http://spark.apache.org/docs/latest/ml-features.html#stringindexer
>
> On Thu, Jun 15, 2017 at 10:19 PM, saatvikshah1994 <saatvikshah1...@gmail.com> wrote:
>
>> Hi,
>> I'm trying to convert a Pandas -> Spark dataframe. One of the columns I have
>> is of the Category type in Pandas. But there does not seem to be support for
>> this same type in Spark. What is the best alternative?

--
Saatvik Shah,
1st Year,
Masters in the School of Computer Science,
Carnegie Mellon University
https://saatvikshah1994.github.io/
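
To make the request above concrete, here is a minimal plain-Python sketch of what "fixing column values to specific categories" amounts to: a fixed, ordered list of allowed labels, each value stored as an integer index, and any label outside the list rejected. It uses neither the Spark nor the pandas API; the class `FixedCategories` and its methods `encode` and `decode` are hypothetical names made up for illustration.

```python
from typing import List, Sequence


class FixedCategories:
    """Hypothetical helper: restricts values to a fixed, ordered set of labels."""

    def __init__(self, categories: Sequence[str]) -> None:
        if len(set(categories)) != len(categories):
            raise ValueError("categories must be unique")
        self.categories: List[str] = list(categories)
        # Lookup table from label to its position in the category list.
        self._index = {label: i for i, label in enumerate(self.categories)}

    def encode(self, values: Sequence[str]) -> List[int]:
        """Map each label to its integer code; reject labels outside the set."""
        unknown = sorted({v for v in values if v not in self._index})
        if unknown:
            raise ValueError(f"values outside the allowed categories: {unknown}")
        return [self._index[v] for v in values]

    def decode(self, codes: Sequence[int]) -> List[str]:
        """Map integer codes back to their labels."""
        return [self.categories[c] for c in codes]


sizes = FixedCategories(["small", "medium", "large"])
codes = sizes.encode(["large", "small", "small"])
print(codes)                # [2, 0, 0]
print(sizes.decode(codes))  # ['large', 'small', 'small']
```

Under this scheme the codes can travel as an ordinary integer column, with the category list kept alongside it. This is similar in spirit to the label indices that StringIndexer produces, as mentioned in the quoted reply, though the sketch makes no claim to match its ordering or behaviour.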