Oh, interesting: does this really mean the return of distributing R code from the driver to executors and running it remotely, or do I misunderstand? This would require having R on the executor nodes, like it used to?
On Wed, Jun 29, 2016 at 5:53 PM, Xinh Huynh <xinh.hu...@gmail.com> wrote:
> There is some new SparkR functionality coming in Spark 2.0, such as
> "dapply". You could use SparkR to load a Parquet file and then run "dapply"
> to apply a function to each partition of a DataFrame.
>
> Info about loading a Parquet file:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/sparkr.html#from-data-sources
>
> API doc for "dapply":
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/api/R/index.html
>
> Xinh
>
> On Wed, Jun 29, 2016 at 6:54 AM, sujeet jog <sujeet....@gmail.com> wrote:
>>
>> Try Spark's pipeRDD; you can invoke the R script from pipe and push the
>> input you want the R script to process onto its stdin.
>>
>> On Wed, Jun 29, 2016 at 7:10 PM, Gilad Landau <gilad.lan...@clicktale.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> I want to use R code as part of a Spark application (the same way I would
>>> with Scala/Python). I want to be able to run R syntax as a map
>>> function on a big Spark DataFrame loaded from a Parquet file.
>>>
>>> Is this even possible, or is the only way to use R as part of an RStudio
>>> orchestration of our Spark cluster?
>>>
>>> Thanks for the help!
>>>
>>> Gilad
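
For anyone following along, a minimal sketch of the read.parquet + dapply approach Xinh describes, assuming the Spark 2.0 SparkR API. The file path, the input columns "x" and "y", and the derived "ratio" column are placeholders; adapt the schema to your own data. Note that dapply runs your R function on the executors, so R does need to be installed on every worker node.

library(SparkR)

# Start a SparkR session (Spark 2.0+ API)
sparkR.session(appName = "dapply-example")

# Load a Parquet file into a SparkDataFrame
# ("/path/to/data.parquet" is a placeholder path)
df <- read.parquet("/path/to/data.parquet")

# Schema of the data.frame returned by the per-partition function;
# here we assume the input has numeric columns "x" and "y".
resultSchema <- structType(
  structField("x", "double"),
  structField("y", "double"),
  structField("ratio", "double")
)

# dapply calls the R function once per partition, passing the partition
# in as a local R data.frame; the function must return a data.frame
# matching resultSchema. This is the part that executes R on the executors.
result <- dapply(df, function(partition) {
  partition$ratio <- partition$x / partition$y
  partition
}, resultSchema)

head(collect(result))

sparkR.session.stop()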