Thanks, Christopher. I have seen it before, and it is amazing. Last time I tried to download it from Adatao, but I got no response after filling in the form. How can I download it or its source code? What is the license?

Kui

> On Sep 6, 2014, at 8:08 PM, Christopher Nguyen <c...@adatao.com> wrote:
>
> Hi Kui,
>
> DDF (open sourced) also aims to do something similar, adding RDBMS idioms,
> and is already implemented on top of Spark.
>
> One philosophy is that the DDF API aggressively hides the notion of parallel
> datasets, exposing only (mutable) tables to users, on which they can apply R
> and other familiar data mining/machine learning idioms, without having to
> know about the distributed representation underneath. Now, you can get to the
> underlying RDDs if you want to, simply by asking for it.
>
> This was launched at the July Spark Summit. See
> http://spark-summit.org/2014/talk/distributed-dataframe-ddf-on-apache-spark-simplifying-big-data-for-the-rest-of-us
> .
>
> Sent while mobile. Please excuse typos etc.
>
> On Sep 4, 2014 1:59 PM, "Shivaram Venkataraman" <shiva...@eecs.berkeley.edu>
> wrote:
> Thanks Kui. SparkR is a pretty young project, but there are a bunch of
> things we are working on. One of the main features is to expose a data
> frame API (https://sparkr.atlassian.net/browse/SPARKR-1) and we will
> be integrating this with Spark's MLLib. At a high-level this will
> allow R users to use a familiar API but make use of MLLib's efficient
> distributed implementation. This is the same strategy used in Python
> as well.
>
> Also we do hope to merge SparkR with mainline Spark -- we have a few
> features to complete before that and plan to shoot for integration by
> Spark 1.3.
>
> Thanks
> Shivaram
>
> On Wed, Sep 3, 2014 at 9:24 PM, oppokui <oppo...@gmail.com> wrote:
> > Thanks, Shivaram.
> >
> > No specific use case yet. We try to use R in our project as data scientest
> > are all knowing R. We had a concern that how R handles the mass data. Spark
> > does a better work on big data area, and Spark ML is focusing on predictive
> > analysis area. Then we are thinking whether we can merge R and Spark
> > together. We tried SparkR and it is pretty easy to use. But we didn’t see
> > any feedback on this package in industry. It will be better if Spark team
> > has R support just like scala/Java/Python.
> >
> > Another question is that MLlib will re-implement all famous data mining
> > algorithms in Spark, then what is the purpose of using R?
> >
> > There is another technique for us H2O which support R natively. H2O is more
> > friendly to data scientist. I saw H2O can also work on Spark (Sparkling
> > Water). It is better than using SparkR?
> >
> > Thanks and Regards.
> >
> > Kui
> >
> >
> > On Sep 4, 2014, at 1:47 AM, Shivaram Venkataraman
> > <shiva...@eecs.berkeley.edu> wrote:
> >
> > Hi
> >
> > Do you have a specific use-case where SparkR doesn't work well ? We'd love
> > to hear more about use-cases and features that can be improved with SparkR.
> >
> > Thanks
> > Shivaram
> >
> >
> > On Wed, Sep 3, 2014 at 3:19 AM, oppokui <oppo...@gmail.com> wrote:
> >>
> >> Does spark ML team have plan to support R script natively? There is a
> >> SparkR project, but not from spark team. Spark ML used netlib-java to talk
> >> with native fortran routines or use NumPy, why not try to use R in some
> >> sense.
> >>
> >> R had lot of useful packages. If spark ML team can include R support, it
> >> will be a very powerful.
> >>
> >> Any comment?
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> >> For additional commands, e-mail: user-h...@spark.apache.org
> >>