viirya commented on code in PR #493:
URL: https://github.com/apache/spark-website/pull/493#discussion_r1426302479

##########
examples.md:
##########
@@ -19,12 +19,18 @@
 On top of Spark’s RDD API, high level APIs are provided, e.g. [DataFrame API](https://spark.apache.org/docs/latest/sql-programming-guide.html#datasets-and-dataframes) and [Machine Learning API](https://spark.apache.org/docs/latest/mllib-guide.html).
 These high level APIs provide a concise way to conduct certain data operations.
-In this page, we will show examples using RDD API as well as examples using high level APIs.
-<h2>RDD API examples</h2>
+<h2>DataFrame API examples</h2>
+<p>
+In Spark, a <a href="https://spark.apache.org/docs/latest/sql-programming-guide.html#dataframes">DataFrame</a>
+is a distributed collection of data organized into named columns.
+Users can use DataFrame API to perform various relational operations on both external
+data sources and Spark’s built-in distributed collections without providing specific procedures for processing data.
+Also, programs based on DataFrame API will be automatically optimized by Spark’s built-in optimizer, Catalyst.
+</p>
 <h3>Word count</h3>
-<p>In this example, we use a few transformations to build a dataset of (String, Int) pairs called <code>counts</code> and then save it to a file.</p>
+<p>In this example, we use a few transformations to build a dataset of (String, Long) pairs and then save it to a file.</p>

Review Comment:
   Hmm, I think `transformations` is a term used for RDD operations (i.e., `transformations` and `actions`, as mentioned earlier on this page)? For DataFrame, maybe `relational operations`?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org