It looks like it. "DataFrame UDFs in R" is resolved in Spark 2.0: https://issues.apache.org/jira/browse/SPARK-6817
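For a sense of how this looks from the R side, here's a minimal sketch using the new dapply (the Parquet path and column names are made up for illustration):

library(SparkR)
sparkR.session()   # Spark 2.0 entry point

# Load a DataFrame from a Parquet file (hypothetical path)
df <- read.parquet("/path/to/data.parquet")

# dapply needs the output schema declared up front; this assumes the
# input has a single numeric column "value" and appends a doubled copy
schema <- structType(structField("value", "double"),
                     structField("doubled", "double"))

doubled <- dapply(df, function(part) {
  # 'part' is a plain R data.frame holding one partition
  cbind(part, doubled = part$value * 2)
}, schema)

head(doubled)

Note that dapply runs your function in an R process on each executor, so R still has to be installed on the worker nodes.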
Here's some of the code:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala

/**
 * A function wrapper that applies the given R function to each partition.
 */
private[sql] case class MapPartitionsRWrapper(
    func: Array[Byte],
    packageNames: Array[Byte],
    broadcastVars: Array[Broadcast[Object]],
    inputSchema: StructType,
    outputSchema: StructType) extends (Iterator[Any] => Iterator[Any])

Xinh

On Wed, Jun 29, 2016 at 2:59 PM, Sean Owen <so...@cloudera.com> wrote:

> Here we (or certainly I) are not talking about R Server, but plain vanilla
> R, as used with Spark and SparkR. Currently, SparkR doesn't distribute R
> code at all (it used to, sort of), so I'm wondering if that is changing
> back.
>
> On Wed, Jun 29, 2016 at 10:53 PM, John Aherne <john.ahe...@justenough.com> wrote:
>
>> I don't think R Server requires R on the executor nodes. I originally set
>> up a SparkR cluster for our data scientist on Azure, which required that I
>> install R on each node, but for the R Server setup there is an extra edge
>> node with R Server that they connect to. From what little research I was
>> able to do, it seems that there are some special functions in R Server that
>> can distribute the work to the cluster.
>>
>> Documentation is light and hard to find, but I found this helpful:
>> https://blogs.msdn.microsoft.com/uk_faculty_connection/2016/05/10/r-server-for-hdinsight-running-on-microsoft-azure-cloud-data-science-challenges/
>>
>> On Wed, Jun 29, 2016 at 3:29 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>>> Oh, interesting: does this really mean the return of distributing R
>>> code from the driver to executors and running it remotely, or do I
>>> misunderstand? This would require having R on the executor nodes, like
>>> it used to?
>>>
>>> On Wed, Jun 29, 2016 at 5:53 PM, Xinh Huynh <xinh.hu...@gmail.com> wrote:
>>>
>>> > There is some new SparkR functionality coming in Spark 2.0, such as
>>> > "dapply". You could use SparkR to load a Parquet file and then run
>>> > "dapply" to apply a function to each partition of a DataFrame.
>>> >
>>> > Info about loading a Parquet file:
>>> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/sparkr.html#from-data-sources
>>> >
>>> > API doc for "dapply":
>>> > http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/api/R/index.html
>>> >
>>> > Xinh
>>> >
>>> > On Wed, Jun 29, 2016 at 6:54 AM, sujeet jog <sujeet....@gmail.com> wrote:
>>> >>
>>> >> Try Spark pipeRDDs: you can invoke the R script from pipe and push the
>>> >> stuff you want to do onto the R script's stdin.
>>> >>
>>> >> On Wed, Jun 29, 2016 at 7:10 PM, Gilad Landau <gilad.lan...@clicktale.com> wrote:
>>> >>>
>>> >>> Hello,
>>> >>>
>>> >>> I want to use R code as part of a Spark application (the same way I
>>> >>> would do with Scala/Python). I want to be able to run R code as a map
>>> >>> function on a big Spark DataFrame loaded from a Parquet file.
>>> >>>
>>> >>> Is this even possible, or is the only way to use R through RStudio
>>> >>> orchestration of our Spark cluster?
>>> >>>
>>> >>> Thanks for the help!
>>> >>>
>>> >>> Gilad
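P.S. On the pipeRDD suggestion above: the R side of a pipe is just a script that reads one record per line from stdin and writes results to stdout. A minimal sketch (the doubling logic is only a placeholder; adapt the parsing to your record format):

#!/usr/bin/env Rscript
# Invoked from Spark core with something like rdd.pipe("Rscript double.R").
# Each RDD element arrives as one line on stdin; every line written to
# stdout becomes one element of the resulting RDD.
con <- file("stdin", open = "r")
while (length(line <- readLines(con, n = 1)) > 0) {
  x <- as.numeric(line)
  cat(x * 2, "\n", sep = "")
}
close(con)

Note that pipe operates on RDDs rather than DataFrames, so you'd convert the DataFrame to an RDD of strings first, and it still requires Rscript to be installed on every executor node.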