Thanks, Shivaram.

Kui
> On Sep 19, 2014, at 12:58 AM, Shivaram Venkataraman
> <shiva...@eecs.berkeley.edu> wrote:
>
> As R is single-threaded, SparkR launches one R process per executor on
> the worker side.
>
> Thanks
> Shivaram
>
> On Thu, Sep 18, 2014 at 7:49 AM, oppokui <oppo...@gmail.com> wrote:
>> Shivaram,
>>
>> As I understand it, SparkR uses the rJava package. On the worker node,
>> Spark executes R code by launching an R process and sending/receiving
>> byte arrays. I have a question about when the R process is launched:
>> is there one R process per worker process, per executor thread, or per
>> RDD operation?
>>
>> Thanks and Regards.
>>
>> Kui
>>
>>> On Sep 6, 2014, at 5:53 PM, oppokui <oppo...@gmail.com> wrote:
>>>
>>> Cool! That is very good news. Can't wait for it.
>>>
>>> Kui
>>>
>>>> On Sep 5, 2014, at 1:58 AM, Shivaram Venkataraman
>>>> <shiva...@eecs.berkeley.edu> wrote:
>>>>
>>>> Thanks Kui. SparkR is a pretty young project, but there are a bunch of
>>>> things we are working on. One of the main features is to expose a data
>>>> frame API (https://sparkr.atlassian.net/browse/SPARKR-1), and we will
>>>> be integrating this with Spark's MLlib. At a high level this will
>>>> allow R users to use a familiar API while making use of MLlib's
>>>> efficient distributed implementations. This is the same strategy used
>>>> in Python as well.
>>>>
>>>> Also, we do hope to merge SparkR into mainline Spark -- we have a few
>>>> features to complete before that and plan to shoot for integration by
>>>> Spark 1.3.
>>>>
>>>> Thanks
>>>> Shivaram
>>>>
>>>> On Wed, Sep 3, 2014 at 9:24 PM, oppokui <oppo...@gmail.com> wrote:
>>>>> Thanks, Shivaram.
>>>>>
>>>>> No specific use case yet. We are trying to use R in our project
>>>>> because our data scientists all know R, but we were concerned about
>>>>> how R handles large data sets. Spark does a better job in the big
>>>>> data area, and Spark ML focuses on predictive analytics, so we are
>>>>> considering whether we can combine R and Spark.
>>>>> We tried SparkR and it is pretty easy to use, but we haven't seen
>>>>> any feedback on this package from industry. It would be better if
>>>>> the Spark team supported R just as it does Scala/Java/Python.
>>>>>
>>>>> Another question: if MLlib will re-implement all the well-known data
>>>>> mining algorithms in Spark, what is the purpose of using R?
>>>>>
>>>>> There is another option for us, H2O, which supports R natively and
>>>>> is more friendly to data scientists. I saw that H2O can also run on
>>>>> Spark (Sparkling Water). Is that better than using SparkR?
>>>>>
>>>>> Thanks and Regards.
>>>>>
>>>>> Kui
>>>>>
>>>>>
>>>>> On Sep 4, 2014, at 1:47 AM, Shivaram Venkataraman
>>>>> <shiva...@eecs.berkeley.edu> wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> Do you have a specific use case where SparkR doesn't work well? We'd
>>>>> love to hear more about use cases and features that can be improved
>>>>> in SparkR.
>>>>>
>>>>> Thanks
>>>>> Shivaram
>>>>>
>>>>>
>>>>> On Wed, Sep 3, 2014 at 3:19 AM, oppokui <oppo...@gmail.com> wrote:
>>>>>>
>>>>>> Does the Spark ML team have a plan to support R scripts natively?
>>>>>> There is a SparkR project, but it is not from the Spark team. Spark
>>>>>> ML uses netlib-java to talk to native Fortran routines, or NumPy on
>>>>>> the Python side, so why not try to use R in some sense?
>>>>>>
>>>>>> R has a lot of useful packages. If the Spark ML team can include R
>>>>>> support, it will be very powerful.
>>>>>>
>>>>>> Any comment?
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>>>> For additional commands, e-mail: user-h...@spark.apache.org
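The execution model Shivaram describes (the driver launches one worker
process per executor and exchanges serialized byte arrays with it, since
R itself is single-threaded) can be sketched in miniature. Below is a toy
Python illustration of that driver/worker split, not SparkR's actual
implementation: the partitioning scheme, the `run_job` helper, and the
hard-coded squaring function in the worker are all invented for the
example.

```python
import pickle
import subprocess
import sys

# The "R process" stand-in: a separate interpreter that reads a serialized
# partition from stdin, applies a function, and writes serialized results
# to stdout -- data crosses the process boundary only as byte arrays.
WORKER = r"""
import pickle, sys
part = pickle.loads(sys.stdin.buffer.read())
sys.stdout.buffer.write(pickle.dumps([x * x for x in part]))
"""

def run_job(data, num_executors=2):
    # One worker process per "executor", as SparkR does with R.
    parts = [data[i::num_executors] for i in range(num_executors)]
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
        )
        for _ in parts
    ]
    out = []
    for proc, part in zip(procs, parts):
        # Serialize the partition, ship it to the worker, deserialize results.
        stdout, _ = proc.communicate(pickle.dumps(part))
        out.extend(pickle.loads(stdout))
    return out
```

For example, `run_job(list(range(5)))` squares the input across two worker
processes. The round-robin partitioning means results come back grouped by
executor rather than in input order, which is also why real Spark
transformations make no ordering promises across partitions.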