Hi Yan,

The reason I was looking for a categorical data type is the one given
here <https://pandas.pydata.org/pandas-docs/stable/categorical.html>:
the ability to restrict a column's values to a fixed set of categories.
Is it possible to create a user-defined data type in Spark that does this?
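
To make the question concrete, here is a minimal plain-Python sketch of the
behaviour I have in mind. It uses neither pandas nor Spark, and `encode_fixed`
is a hypothetical helper name, not an existing API: each value is mapped to the
index of its category, and anything outside the fixed set is flagged as None.

```python
from typing import List, Optional, Sequence


def encode_fixed(values: Sequence[str],
                 categories: Sequence[str]) -> List[Optional[int]]:
    """Map each value to the index of its category.

    Values that are not in the fixed category list map to None.
    (Hypothetical helper for illustration only.)
    """
    # Build the lookup once: category label -> integer code.
    index = {cat: i for i, cat in enumerate(categories)}
    return [index.get(v) for v in values]


sizes = ["small", "large", "medium", "huge"]
codes = encode_fixed(sizes, categories=["small", "medium", "large"])
print(codes)  # [0, 2, 1, None]  ("huge" is outside the fixed set)
```

The index lookup is roughly the label-to-index idea behind the StringIndexer
suggestion quoted below, except that here the category list is fixed up front
rather than learned from the data.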

Thanks and Regards,
Saatvik Shah

On Fri, Jun 16, 2017 at 1:42 AM, 颜发才(Yan Facai) <facai....@gmail.com> wrote:

> You can use Transformers to handle categorical data. For example,
> StringIndexer encodes a string column of labels to a column of label
> indices:
> http://spark.apache.org/docs/latest/ml-features.html#stringindexer
>
>
> On Thu, Jun 15, 2017 at 10:19 PM, saatvikshah1994 <
> saatvikshah1...@gmail.com> wrote:
>
>> Hi,
>> I'm trying to convert a Pandas DataFrame to a Spark DataFrame. One of
>> the columns I have is of the Category type in Pandas, but there does not
>> seem to be support for this type in Spark. What is the best alternative?
>


-- 
*Saatvik Shah,*
*1st  Year,*
*Masters in the School of Computer Science,*
*Carnegie Mellon University*

*https://saatvikshah1994.github.io/ <https://saatvikshah1994.github.io/>*
