GitHub user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6490#discussion_r31339416
  
    --- Diff: docs/sparkr.md ---
    @@ -0,0 +1,198 @@
    +---
    +layout: global
    +displayTitle: SparkR (R on Spark)
    +title: SparkR (R on Spark)
    +---
    +
    +* This will become a table of contents (this text will be scraped).
    +{:toc}
    +
    +# Overview
    +SparkR is an R package that provides a light-weight frontend to use Apache 
Spark from R.
    +In Spark {{site.SPARK_VERSION}}, SparkR provides a distributed data frame 
implementation that
    +supports operations similar to R data frames, 
[dplyr](https://github.com/hadley/dplyr) but on large
    +datasets.
    +
    +# SparkR DataFrames
    +
    +A DataFrame is a distributed collection of data organized into named 
columns. It is conceptually
    +equivalent to a table in a relational database or a data frame in R, but 
with richer
    +optimizations under the hood. DataFrames can be constructed from a wide 
array of sources such as:
    +structured data files, tables in Hive, external databases, or existing 
local R data frames.
    +
    +All of the examples on this page use sample data included in R or the 
Spark distribution and can be run using the `./bin/sparkR` shell.
    +
    +## Starting Up: SparkContext, SQLContext
    +
    +<div data-lang="r"  markdown="1">
    +The entry point into SparkR is the `SparkContext` which connects your R 
program to a Spark cluster.
    +You can create a `SparkContext` using `sparkR.init` and pass in options 
such as the application name
    +etc. Further, to work with DataFrames we will need a `SQLContext`, which 
can be created from the 
    +SparkContext. If you are working from the SparkR shell, the `SQLContext` 
and `SparkContext` should
    +already be created for you.
    +
    +{% highlight r %}
    +sc <- sparkR.init()
    +sqlContext <- sparkRSQL.init(sc)
    --- End diff --
    
    Can we hide `sparkR.init()` and call it in `sparkRSQL.init()` internally?



