Hi Wush,

I'm CC'ing user@spark.apache.org (which is the new list) and BCC'ing u...@spark.incubator.apache.org.

In Spark 1.3, SchemaRDD is in fact being renamed to DataFrame (see: https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html ).

As for a "model.matrix", you might have a look at the new pipelines API in Spark 1.2 (to be further improved in 1.3), which provides facilities for repeatable data transformation as input to ML algorithms. That said, something to handle the case of automatically one-hot encoding all the categorical variables in a DataFrame might be a welcome addition.

- Evan

On Thu, Mar 5, 2015 at 8:43 PM, Wush Wu <w...@bridgewell.com> wrote:

> Dear all,
>
> I am a new Spark user coming from R.
>
> After exploring the SchemaRDD, I noticed that it is similar to data.frame.
> Is there a feature like `model.matrix` in R to convert a SchemaRDD to a model
> matrix automatically according to the column types, without explicitly
> converting them one by one?
>
> Thanks,
> Wush
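
To make the "automatically one-hot encode all the categorical variables" idea concrete, here is a minimal sketch in plain Python. It does not use Spark and is not an existing Spark API. The function name `one_hot_rows`, the rule "a column is categorical if all of its values are strings", and the sample records are all assumptions made for illustration. Note also that it keeps an indicator for every level, whereas R's `model.matrix` by default adds an intercept and drops one level per factor.

```python
# Plain-Python sketch (no Spark): expand string-typed columns into 0/1
# indicator columns. All names and data here are hypothetical.

def one_hot_rows(rows):
    """Return (column_names, encoded_rows) for a list of dict records."""
    if not rows:
        return [], []
    columns = list(rows[0].keys())
    # Assumption: a column is categorical if every value in it is a string.
    categorical = [c for c in columns if all(isinstance(r[c], str) for r in rows)]
    # Sorted distinct levels for each categorical column.
    levels = {c: sorted({r[c] for r in rows}) for c in categorical}

    # Output column names: one per level for categorical columns,
    # the original name for numeric columns.
    names = []
    for c in columns:
        if c in levels:
            names.extend(f"{c}={lvl}" for lvl in levels[c])
        else:
            names.append(c)

    # Encode each record in the same column order as `names`.
    encoded = []
    for r in rows:
        out = []
        for c in columns:
            if c in levels:
                out.extend(1.0 if r[c] == lvl else 0.0 for lvl in levels[c])
            else:
                out.append(float(r[c]))
        encoded.append(out)
    return names, encoded


rows = [
    {"age": 31, "city": "Taipei", "plan": "free"},
    {"age": 45, "city": "Berkeley", "plan": "paid"},
    {"age": 27, "city": "Taipei", "plan": "paid"},
]
names, matrix = one_hot_rows(rows)
```

In a real Spark job the level collection would have to be a distributed pass over the data, and the result would more naturally be a sparse vector column than a list of lists; the sketch only shows the type-driven column expansion.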