Re: can we create dummy variables from categorical variables, using sparkR

2016-01-24 Thread Yanbo Liang
Hi Devesh,

RFormula automatically encodes categorical variables (columns of string
type) as dummy variables. You do not need to do the dummy transformation
explicitly if you want to train a machine learning model using SparkR.
Note, however, that SparkR currently supports only a limited set of ML
algorithms (GLM).
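
To make concrete what "encoding as dummy variables" means, here is a minimal
sketch in plain Python. It is not SparkR or RFormula code and does not call
any Spark API; the function name dummy_encode is hypothetical, and the choice
of reference level (the alphabetically first value) is an assumption made for
illustration. RFormula may pick a different reference level.

```python
def dummy_encode(values, reference=None):
    """Dummy-encode a list of category strings.

    Returns (kept_levels, rows): one 0/1 indicator column per level,
    except the reference level, which is represented by all zeros.
    """
    levels = sorted(set(values))
    if reference is None:
        # Assumption for this sketch: drop the alphabetically first level.
        reference = levels[0]
    if reference not in levels:
        raise ValueError("reference level not found in values")
    kept_levels = [lv for lv in levels if lv != reference]
    rows = [[1.0 if v == lv else 0.0 for lv in kept_levels] for v in values]
    return kept_levels, rows


species = ["setosa", "versicolor", "virginica", "setosa"]
columns, rows = dummy_encode(species)
for value, row in zip(species, rows):
    print(value, dict(zip(columns, row)))
```

A column with k distinct values becomes k - 1 indicator columns here, which
is the usual layout for a model that also fits an intercept.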

Thanks
Yanbo

2016-01-20 1:15 GMT+08:00 Vinayak Agrawal:

> Yes, you can use the RFormula library. Please see
>
> https://databricks.com/blog/2015/10/05/generalized-linear-models-in-sparkr-and-r-formula-support-in-mllib.html
>
> On Tue, Jan 19, 2016 at 10:34 AM, Devesh Raj Singh wrote:
>
>> Hi,
>>
>> Can we create dummy variables for categorical variables in SparkR, like we
>> do using the "dummies" package in R?
>>
>> --
>> Warm regards,
>> Devesh.
>>
>
>
>
> --
> Vinayak Agrawal
> Big Data Analytics
> IBM
>
> "To Strive, To Seek, To Find and Not to Yield!"
> ~Lord Alfred Tennyson
>


can we create dummy variables from categorical variables, using sparkR

2016-01-19 Thread Devesh Raj Singh
Hi,

Can we create dummy variables for categorical variables in SparkR, like we
do using the "dummies" package in R?

-- 
Warm regards,
Devesh.


Re: can we create dummy variables from categorical variables, using sparkR

2016-01-19 Thread Vinayak Agrawal
Yes, you can use the RFormula library. Please see
https://databricks.com/blog/2015/10/05/generalized-linear-models-in-sparkr-and-r-formula-support-in-mllib.html

On Tue, Jan 19, 2016 at 10:34 AM, Devesh Raj Singh wrote:

> Hi,
>
> Can we create dummy variables for categorical variables in SparkR, like we
> do using the "dummies" package in R?
>
> --
> Warm regards,
> Devesh.
>



-- 
Vinayak Agrawal
Big Data Analytics
IBM

"To Strive, To Seek, To Find and Not to Yield!"
~Lord Alfred Tennyson