GitHub user linbojin opened a pull request:
https://github.com/apache/spark/pull/14645
[MINOR] [DOC] Correct code snippet results in quick start documentation
## What changes were proposed in this pull request?
Because the README.md file has been updated over time, some code snippet
outputs in the quick start documentation no longer match the current file.
For example:
```
scala> textFile.count()
res0: Long = 126
```
should be
```
scala> textFile.count()
res0: Long = 99
```
This PR corrects these outputs so that new Spark learners have an accurate
reference.
It also fixes a small bug: in the current documentation, the outputs of
linesWithSpark.count() without and with cache are different (one is 15 and the
other is 19), although both count the same filtered lines:
```
scala> val linesWithSpark = textFile.filter(line => line.contains("Spark"))
linesWithSpark: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at filter at <console>:27
scala> textFile.filter(line => line.contains("Spark")).count() // How many lines contain "Spark"?
res3: Long = 15
...
scala> linesWithSpark.cache()
res7: linesWithSpark.type = MapPartitionsRDD[2] at filter at <console>:27
scala> linesWithSpark.count()
res8: Long = 19
```
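The consistency that the documentation should reflect can be checked directly. The following is a minimal sketch, not part of the patch: it assumes a spark-shell session where `sc` is predefined and a `README.md` in the working directory, and `countsWithAndWithoutCache` is a hypothetical helper name introduced only for illustration.

```scala
import org.apache.spark.rdd.RDD

// Hypothetical helper (not part of Spark or of this PR): counts the lines
// containing `word` once before and once after calling cache().
def countsWithAndWithoutCache(textFile: RDD[String], word: String): (Long, Long) = {
  val matching = textFile.filter(line => line.contains(word))
  val uncached = matching.count()   // computed from the source file
  matching.cache()                  // mark the RDD for in-memory reuse
  val cached = matching.count()     // recomputed, populating the cache
  (uncached, cached)
}

// Assumes spark-shell (`sc` predefined) and README.md in the working directory.
val (uncached, cached) = countsWithAndWithoutCache(sc.textFile("README.md"), "Spark")

// Caching changes where the data is read from, not the result.
assert(uncached == cached, s"counts differ: $uncached vs $cached")
```

If the input file does not change between the two calls, both counts should agree, so the two numbers printed in the quick start guide should be the same value.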
## How was this patch tested?
Manual test: run `SKIP_API=1 jekyll serve --watch`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/linbojin/spark quick-start-documentation
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14645.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14645
----
commit f093e3a44a6447f619edd987bf30ee838899c578
Author: linbojin <[email protected]>
Date: 2016-08-15T06:26:39Z
correct result numbers inside quick start docs based on new README.md file
----