Re: converting categorical values in csv file to numerical values

2015-11-05 Thread tog

If your corpus is large (NLP), this is indeed the best solution. Otherwise (few words, i.e. categories), I guess you will end up with the same result.

On Friday, 6 November 2015, Balachandar R.A. wrote:
> Hi Guillaume,
>
> This is always an option. However, I read about HashingTF which exactly
> do

Re: converting categorical values in csv file to numerical values

2015-11-05 Thread Balachandar R.A.

Hi Guillaume,

This is always an option. However, I read about HashingTF, which does exactly this quite efficiently and can scale too. Hence, I am looking for a solution using this technique.

Regards,
Bala

On 5 November 2015 at 18:50, tog wrote:
> Hi Bala
>
> Can't you do a simple dictionary and m
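The idea behind HashingTF is the hashing trick: each string is hashed and the hash is reduced modulo a fixed number of buckets, so no dictionary has to be built or stored. The following is a minimal plain-Python sketch of that idea, not Spark's implementation. The function names, the use of `zlib.crc32` as the hash, and the bucket count are all assumptions made for illustration.

```python
import zlib


def hash_bucket(value, num_buckets):
    """Map a string to a bucket index in [0, num_buckets) with a stable hash."""
    # zlib.crc32 gives the same result on every run, unlike Python's
    # built-in hash() for strings, which is randomized per process.
    return zlib.crc32(value.encode("utf-8")) % num_buckets


def hash_encode(values, num_buckets=1 << 10):
    """Encode a list of categorical strings as bucket indices."""
    return [hash_bucket(v, num_buckets) for v in values]


# Hypothetical sample values standing in for one categorical column.
colors = ["red", "blue", "red", "green"]
codes = hash_encode(colors)
```

Equal strings always land in the same bucket, but two different strings can collide, which is the trade-off against an explicit dictionary. A larger bucket count makes collisions less likely.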

Re: converting categorical values in csv file to numerical values

2015-11-05 Thread tog

Hi Bala

Can't you do a simple dictionary and map those values to numbers?

Cheers
Guillaume

On 5 November 2015 at 09:54, Balachandar R.A. wrote:
> Hi
>
> I am new to Spark MLlib and machine learning. I have a CSV file that
> consists of around 100 thousand rows and 20 columns. Of these 20 co
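The dictionary approach suggested here could look roughly like the sketch below, written in plain Python rather than Spark. The sample data, the column positions, and the helper names `build_index` and `encode_columns` are hypothetical. The sketch also assumes that every column not listed as categorical holds a numeric value.

```python
import csv
import io


def build_index(values):
    """Map each distinct value to an integer, in order of first appearance."""
    index = {}
    for v in values:
        if v not in index:
            index[v] = len(index)
    return index


def encode_columns(rows, categorical_cols):
    """Replace strings in the given column positions with integer codes."""
    # One dictionary per categorical column.
    indexes = {c: build_index(row[c] for row in rows) for c in categorical_cols}
    encoded = [
        # Assumption: columns that are not categorical parse as floats.
        [indexes[c][v] if c in indexes else float(v) for c, v in enumerate(row)]
        for row in rows
    ]
    return encoded, indexes


# Hypothetical sample data standing in for the CSV file from the question.
sample = "red,1.5,small\nblue,2.0,large\nred,3.25,large\n"
rows = list(csv.reader(io.StringIO(sample)))
encoded, indexes = encode_columns(rows, categorical_cols=[0, 2])
```

Keeping `indexes` around matters in practice: the same dictionaries have to be reused to encode any new data so that each category keeps the same number.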

converting categorical values in csv file to numerical values

2015-11-05 Thread Balachandar R.A.

Hi

I am new to Spark MLlib and machine learning. I have a CSV file that consists of around 100 thousand rows and 20 columns. Of these 20 columns, 10 contain string values. The values in these columns are not necessarily unique. They are kind of categorical, that is, the values could be one amoun