Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13751#discussion_r67604494 --- Diff: docs/sparkr.md --- @@ -14,29 +14,24 @@ supports operations like selection, filtering, aggregation etc. (similar to R da [dplyr](https://github.com/hadley/dplyr)) but on large datasets. SparkR also supports distributed machine learning using MLlib. -# SparkR DataFrames +# SparkDataFrames -A DataFrame is a distributed collection of data organized into named columns. It is conceptually +A SparkDataFrame is a distributed collection of data organized into named columns. It is conceptually equivalent to a table in a relational database or a data frame in R, but with richer -optimizations under the hood. DataFrames can be constructed from a wide array of sources such as: +optimizations under the hood. SparkDataFrames can be constructed from a wide array of sources such as: structured data files, tables in Hive, external databases, or existing local R data frames. All of the examples on this page use sample data included in R or the Spark distribution and can be run using the `./bin/sparkR` shell. -## Starting Up: SparkContext, SQLContext +## Starting Up: SparkSession --- End diff -- This causes the broken links, too. In `sparkR.R` line 95, ``` #' \url{http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparkcontext-sqlcontext}. ```
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org