Hi Wush,

I'm CC'ing user@spark.apache.org (which is the new list) and BCC'ing
u...@spark.incubator.apache.org.

In Spark 1.3, SchemaRDD is in fact being renamed to DataFrame (see
https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html).

As for a "model.matrix", you might have a look at the new pipelines API in
Spark 1.2 (to be further improved in 1.3), which provides facilities for
repeatable data transformation as input to ML algorithms. That said,
something that automatically one-hot encodes all the categorical variables
in a DataFrame might be a welcome addition.
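
To make concrete what such a helper would do, here is a rough plain-Python
sketch. It is not Spark API: `one_hot_encode` and the sample rows are
hypothetical, and it assumes every string-valued column is categorical and
everything else is numeric. Note that it keeps an indicator for every level,
whereas R's model.matrix by default drops a reference level when an intercept
is present.

```python
# Hypothetical helper, NOT part of Spark: expands string-valued columns
# into 0/1 indicator columns and passes numeric columns through as floats.
def one_hot_encode(rows):
    """rows: list of dicts sharing the same keys. Returns (names, matrix)."""
    if not rows:
        return [], []
    columns = list(rows[0].keys())
    # Assumption: a column is categorical iff its first value is a string.
    levels = {
        col: sorted({row[col] for row in rows})
        for col in columns
        if isinstance(rows[0][col], str)
    }
    # Build output column names: one per level for categorical columns.
    names = []
    for col in columns:
        if col in levels:
            names.extend(f"{col}={level}" for level in levels[col])
        else:
            names.append(col)
    # Build the numeric matrix row by row.
    matrix = []
    for row in rows:
        out = []
        for col in columns:
            if col in levels:
                out.extend(1.0 if row[col] == level else 0.0
                           for level in levels[col])
            else:
                out.append(float(row[col]))
        matrix.append(out)
    return names, matrix


# Made-up sample data, for illustration only.
rows = [
    {"age": 31, "city": "Taipei", "plan": "free"},
    {"age": 45, "city": "Berkeley", "plan": "paid"},
    {"age": 27, "city": "Taipei", "plan": "paid"},
]
names, matrix = one_hot_encode(rows)
```

The point is only that the encoding is driven by column type, so no column
has to be converted by hand; a real version would work from the DataFrame's
schema rather than inspecting values.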

- Evan

On Thu, Mar 5, 2015 at 8:43 PM, Wush Wu <w...@bridgewell.com> wrote:

> Dear all,
>
> I am a new Spark user coming from R.
>
> After exploring the schemaRDD, I notice that it is similar to data.frame.
> Is there a feature like `model.matrix` in R to convert schemaRDD to model
> matrix automatically according to the type without explicitly converting
> them one by one?
>
> Thanks,
> Wush
