Hi Bala

Couldn't you build a simple dictionary and map those values to numbers?
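
Something along these lines might do it. This is only a rough sketch in plain Python, not tested against your data: the sample rows, the column indices and the helper names (build_category_maps, encode_row) are all made up for illustration, and it assumes the categorical columns are known up front and the remaining columns parse as floats.

```python
import csv
import io

def build_category_maps(rows, categorical_cols):
    """Give each distinct string in each categorical column an integer index,
    assigned in order of first appearance (0, 1, 2, ...)."""
    maps = {col: {} for col in categorical_cols}
    for row in rows:
        for col in categorical_cols:
            mapping = maps[col]
            if row[col] not in mapping:
                mapping[row[col]] = len(mapping)
    return maps

def encode_row(row, maps):
    """Replace categorical strings by their index; parse everything else as float."""
    return [float(maps[i][v]) if i in maps else float(v)
            for i, v in enumerate(row)]

# Hypothetical three-row sample standing in for the real CSV file.
sample = "red,small,1.5\nblue,large,2.0\nred,large,0.5\n"
rows = list(csv.reader(io.StringIO(sample)))

# Assumption: columns 0 and 1 hold the string (categorical) values.
maps = build_category_maps(rows, categorical_cols=[0, 1])
encoded = [encode_row(r, maps) for r in rows]

# Number of distinct categories per categorical column.
arity = {col: len(m) for col, m in maps.items()}
```

The arity dictionary (column index to number of categories) has the shape that MLlib's RandomForest.trainClassifier expects for its categoricalFeaturesInfo argument, so the trees can treat those columns as categorical rather than as ordered numbers. On an RDD you would collect the distinct values per column first and then apply the same mapping in a map() step.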

Cheers
Guillaume

On 5 November 2015 at 09:54, Balachandar R.A. <balachandar...@gmail.com>
wrote:

> Hi
>
>
> I am new to Spark MLlib and machine learning. I have a CSV file with
> around 100 thousand rows and 20 columns. Of these 20 columns, 10 contain
> string values. The values in these columns are not necessarily unique;
> they are categorical, that is, each value is one among, say, 10 possible
> values. To start with, I was able to run the examples, in particular the
> random forest algorithm, on my local Spark (1.5.1) platform. However, I
> have a problem with my dataset because of these strings, as the APIs take
> numerical values. Can anyone tell me how I can map these categorical
> values (strings) to numbers and use them with the random forest algorithm?
> Any example would be greatly appreciated.
>
>
> regards
>
> Bala
>



-- 
PGP KeyID: 2048R/EA31CFC9  subkeys.pgp.net
