Repository: spark
Updated Branches:
  refs/heads/master 8fdc6ce40 -> 6f0988b12


[MINOR][DOC] Correct code snippet results in quick start documentation

## What changes were proposed in this pull request?

As the README.md file is updated over time, some code snippet outputs in the quick
start documentation no longer match the current README.md. For example:
```
scala> textFile.count()
res0: Long = 126
```
should be
```
scala> textFile.count()
res0: Long = 99
```
This PR adds comments to point out this problem so that new Spark learners
have a correct reference.
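
For readers who want to verify the expected value against their own checkout, here is a minimal sketch (assuming spark-shell is launched from the Spark source root, so `README.md` resolves relative to the working directory):
```scala
// Hypothetical spark-shell session: the count simply equals however many
// lines the checked-out README.md currently has (compare `wc -l README.md`).
val textFile = sc.textFile("README.md")
textFile.count()  // 99 as of this fix; older versions of the docs showed 126
```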
It also fixes a small bug: in the current documentation, the outputs of
`linesWithSpark.count()` without and with cache are different (one is 15 and the
other is 19):
```
scala> val linesWithSpark = textFile.filter(line => line.contains("Spark"))
linesWithSpark: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at filter at <console>:27

scala> textFile.filter(line => line.contains("Spark")).count() // How many lines contain "Spark"?
res3: Long = 15

...

scala> linesWithSpark.cache()
res7: linesWithSpark.type = MapPartitionsRDD[2] at filter at <console>:27

scala> linesWithSpark.count()
res8: Long = 19
```
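
The two counts have to agree: `cache()` returns the same RDD (its return type is `this.type`) and only marks it for in-memory storage, so re-counting after caching cannot change the result. A minimal sketch of the invariant, reusing the variable names from the docs:
```scala
// cache() is lazy and returns the RDD itself, so counting the same RDD
// before and after caching must yield the same number.
val linesWithSpark = textFile.filter(line => line.contains("Spark"))
val before = linesWithSpark.count()       // e.g. 15 against the current README.md
linesWithSpark.cache()                    // marks the RDD for caching; nothing recomputed yet
assert(linesWithSpark.count() == before)  // cached count matches the uncached one
```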

## How was this patch tested?

Manual test: run `$ SKIP_API=1 jekyll serve --watch`

Author: linbojin <[email protected]>

Closes #14645 from linbojin/quick-start-documentation.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6f0988b1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6f0988b1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6f0988b1

Branch: refs/heads/master
Commit: 6f0988b1293a5e5ee3620b2727ed969155d7ac0d
Parents: 8fdc6ce
Author: linbojin <[email protected]>
Authored: Tue Aug 16 11:37:54 2016 +0100
Committer: Sean Owen <[email protected]>
Committed: Tue Aug 16 11:37:54 2016 +0100

----------------------------------------------------------------------
 docs/quick-start.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/6f0988b1/docs/quick-start.md
----------------------------------------------------------------------
diff --git a/docs/quick-start.md b/docs/quick-start.md
index 1b961fd..a29e28f 100644
--- a/docs/quick-start.md
+++ b/docs/quick-start.md
@@ -40,7 +40,7 @@ RDDs have _[actions](programming-guide.html#actions)_, which return values, and
 
 {% highlight scala %}
 scala> textFile.count() // Number of items in this RDD
-res0: Long = 126
+res0: Long = 126 // May be different from yours as README.md will change over time, similar to other outputs
 
 scala> textFile.first() // First item in this RDD
 res1: String = # Apache Spark
@@ -184,10 +184,10 @@ scala> linesWithSpark.cache()
 res7: linesWithSpark.type = MapPartitionsRDD[2] at filter at <console>:27
 
 scala> linesWithSpark.count()
-res8: Long = 19
+res8: Long = 15
 
 scala> linesWithSpark.count()
-res9: Long = 19
+res9: Long = 15
 {% endhighlight %}
 
 It may seem silly to use Spark to explore and cache a 100-line text file. The interesting part is
@@ -202,10 +202,10 @@ a cluster, as described in the [programming guide](programming-guide.html#initia
 >>> linesWithSpark.cache()
 
 >>> linesWithSpark.count()
-19
+15
 
 >>> linesWithSpark.count()
-19
+15
 {% endhighlight %}
 
 It may seem silly to use Spark to explore and cache a 100-line text file. The interesting part is

