Oh, interesting: does this really mean the return of distributing R
code from the driver to the executors and running it remotely, or do I
misunderstand? That would require having R installed on the executor
nodes, like it used to?

On Wed, Jun 29, 2016 at 5:53 PM, Xinh Huynh <xinh.hu...@gmail.com> wrote:
> There is some new SparkR functionality coming in Spark 2.0, such as
> "dapply". You could use SparkR to load a Parquet file and then run "dapply"
> to apply a function to each partition of a DataFrame.
>
> Info about loading Parquet file:
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/sparkr.html#from-data-sources
>
> API doc for "dapply":
> http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc1-docs/api/R/index.html
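>
> A minimal sketch of what that could look like (the file path and column
> names below are made up, and it assumes Spark 2.0's SparkR with R
> available on the executors):
>
>   library(SparkR)
>   sparkR.session()
>
>   # Load the Parquet file into a SparkDataFrame
>   df <- read.df("/path/to/data.parquet", source = "parquet")
>
>   # dapply() runs the R function once per partition; the function receives a
>   # local data.frame and must return a data.frame matching the declared schema.
>   outSchema <- structType(structField("value", "double"),
>                           structField("value_doubled", "double"))
>
>   result <- dapply(df, function(part) {
>     part$value_doubled <- part$value * 2   # arbitrary per-partition R code
>     part
>   }, outSchema)
>
>   head(result)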
>
> Xinh
>
> On Wed, Jun 29, 2016 at 6:54 AM, sujeet jog <sujeet....@gmail.com> wrote:
>>
>> Try Spark's RDD pipe(): you can invoke the R script from pipe(), push the
>> data you want processed to the R script's stdin, and read the transformed
>> records back from its stdout.
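>>
>> A rough sketch of the R-script side of that pipe (the script name, the
>> numeric parsing, and the doubling logic are just placeholders): on the
>> Spark side you would call something like rdd.pipe("Rscript process.R");
>> each RDD element arrives as one line on the script's stdin, and each line
>> the script writes to stdout becomes one element of the resulting RDD.
>>
>>   #!/usr/bin/env Rscript
>>   # Read records line by line from stdin, transform them, write to stdout
>>   con <- file("stdin", open = "r")
>>   while (length(line <- readLines(con, n = 1)) > 0) {
>>     x <- as.numeric(line)          # parse one incoming record
>>     cat(x * 2, "\n", sep = "")     # emit one transformed record
>>   }
>>   close(con)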
>>
>>
>> On Wed, Jun 29, 2016 at 7:10 PM, Gilad Landau <gilad.lan...@clicktale.com>
>> wrote:
>>>
>>> Hello,
>>>
>>>
>>>
>>> I want to use R code as part of a Spark application (the same way I would
>>> with Scala/Python). I want to be able to run R code as a map
>>> function on a big Spark DataFrame loaded from a Parquet file.
>>>
>>> Is this even possible, or is the only way to use R through RStudio
>>> orchestrating our Spark cluster?
>>>
>>>
>>>
>>> Thanks for the help!
>>>
>>>
>>>
>>> Gilad
>>>
>>>
>>
>>
>

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
