spark git commit: [DOCS][MINOR] Fix a few broken links and typos, and, nit, use HTTPS more consistently

gurwls223 Tue, 21 Aug 2018 10:02:38 -0700

Repository: spark
Updated Branches:
  refs/heads/master d80063278 -> 35f7f5ce8



[DOCS][MINOR] Fix a few broken links and typos, and, nit, use HTTPS more 
consistently

## What changes were proposed in this pull request?

Fix a few broken links and typos and, as a nit, use HTTPS more consistently, 
especially in script and Apache links.

## How was this patch tested?

Doc build

Closes #22172 from srowen/DocTypo.

Authored-by: Sean Owen <sean.o...@databricks.com>
Signed-off-by: hyukjinkwon <gurwls...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/35f7f5ce
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/35f7f5ce
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/35f7f5ce

Branch: refs/heads/master
Commit: 35f7f5ce83984d8afe0b7955942baa04f2bef74f
Parents: d800632
Author: Sean Owen <sean.o...@databricks.com>
Authored: Wed Aug 22 01:02:17 2018 +0800
Committer: hyukjinkwon <gurwls...@apache.org>
Committed: Wed Aug 22 01:02:17 2018 +0800

----------------------------------------------------------------------
 docs/README.md                                 |  4 ++--
 docs/_layouts/404.html                         |  2 +-
 docs/_layouts/global.html                      |  6 +++---
 docs/building-spark.md                         |  8 ++++----
 docs/contributing-to-spark.md                  |  2 +-
 docs/index.md                                  | 16 ++++++++--------
 docs/ml-migration-guides.md                    |  2 +-
 docs/quick-start.md                            |  2 +-
 docs/rdd-programming-guide.md                  |  4 ++--
 docs/running-on-mesos.md                       |  2 +-
 docs/running-on-yarn.md                        |  2 +-
 docs/security.md                               |  6 +++---
 docs/sparkr.md                                 |  2 +-
 docs/sql-programming-guide.md                  |  6 +++---
 docs/streaming-kinesis-integration.md          |  2 +-
 docs/streaming-programming-guide.md            |  5 ++---
 docs/structured-streaming-programming-guide.md |  2 +-
 17 files changed, 36 insertions(+), 37 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/README.md
----------------------------------------------------------------------
diff --git a/docs/README.md b/docs/README.md
index dbea4d6..7da543d 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -2,7 +2,7 @@ Welcome to the Spark documentation!
 
 This readme will walk you through navigating and building the Spark 
documentation, which is included
 here with the Spark source code. You can also find documentation specific to 
release versions of
-Spark at http://spark.apache.org/documentation.html.
+Spark at https://spark.apache.org/documentation.html.
 
 Read on to learn more about viewing documentation in plain text (i.e., 
markdown) or building the
 documentation yourself. Why build it yourself? So that you have the docs that 
correspond to
@@ -79,7 +79,7 @@ jekyll plugin to run `build/sbt unidoc` before building the 
site so if you haven
 may take some time as it generates all of the scaladoc and javadoc using 
[Unidoc](https://github.com/sbt/sbt-unidoc).
 The jekyll plugin also generates the PySpark docs using 
[Sphinx](http://sphinx-doc.org/), SparkR docs
 using [roxygen2](https://cran.r-project.org/web/packages/roxygen2/index.html) 
and SQL docs
-using [MkDocs](http://www.mkdocs.org/).
+using [MkDocs](https://www.mkdocs.org/).
 
 NOTE: To skip the step of building and copying over the Scala, Java, Python, R 
and SQL API docs, run `SKIP_API=1
 jekyll build`. In addition, `SKIP_SCALADOC=1`, `SKIP_PYTHONDOC=1`, 
`SKIP_RDOC=1` and `SKIP_SQLDOC=1` can be used

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/_layouts/404.html
----------------------------------------------------------------------
diff --git a/docs/_layouts/404.html b/docs/_layouts/404.html
index 0446544..78f98b9 100755
--- a/docs/_layouts/404.html
+++ b/docs/_layouts/404.html
@@ -151,7 +151,7 @@
             <script>
                 var GOOG_FIXURL_LANG = (navigator.language || 
'').slice(0,2),GOOG_FIXURL_SITE = location.host;
             </script>
-            <script src="http://linkhelp.clients.google.com/tbproxy/lh/wm/fixurl.js"></script>
+            <script src="https://linkhelp.clients.google.com/tbproxy/lh/wm/fixurl.js"></script>
         </div>
     </body>
 </html>

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/_layouts/global.html
----------------------------------------------------------------------
diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index e5af5ae..88d549c 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -50,7 +50,7 @@
     </head>
     <body>
         <!--[if lt IE 7]>
-            <p class="chromeframe">You are using an outdated browser. <a href="http://browsehappy.com/">Upgrade your browser today</a> or <a href="http://www.google.com/chromeframe/?redirect=true">install Google Chrome Frame</a> to better experience this site.</p>
+            <p class="chromeframe">You are using an outdated browser. <a href="https://browsehappy.com/">Upgrade your browser today</a> or <a href="http://www.google.com/chromeframe/?redirect=true">install Google Chrome Frame</a> to better experience this site.</p>
         <![endif]-->
 
         <!-- This code is taken from 
http://twitter.github.com/bootstrap/examples/hero.html -->
@@ -114,8 +114,8 @@
                                 <li><a 
href="hardware-provisioning.html">Hardware Provisioning</a></li>
                                 <li class="divider"></li>
                                 <li><a href="building-spark.html">Building 
Spark</a></li>
-                                <li><a href="http://spark.apache.org/contributing.html">Contributing to Spark</a></li>
-                                <li><a href="http://spark.apache.org/third-party-projects.html">Third Party Projects</a></li>
+                                <li><a href="https://spark.apache.org/contributing.html">Contributing to Spark</a></li>
+                                <li><a href="https://spark.apache.org/third-party-projects.html">Third Party Projects</a></li>
                             </ul>
                         </li>
                     </ul>

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/building-spark.md
----------------------------------------------------------------------
diff --git a/docs/building-spark.md b/docs/building-spark.md
index affd7df..d3dfd49 100644
--- a/docs/building-spark.md
+++ b/docs/building-spark.md
@@ -45,7 +45,7 @@ Other build examples can be found below.
 ## Building a Runnable Distribution
 
 To create a Spark distribution like those distributed by the
-[Spark Downloads](http://spark.apache.org/downloads.html) page, and that is 
laid out so as
+[Spark Downloads](https://spark.apache.org/downloads.html) page, and that is 
laid out so as
 to be runnable, use `./dev/make-distribution.sh` in the project root 
directory. It can be configured
 with Maven profile settings and so on like the direct Maven build. Example:
 
@@ -164,7 +164,7 @@ prompt.
 Developers who compile Spark frequently may want to speed up compilation; 
e.g., by using Zinc
 (for developers who build with Maven) or by avoiding re-compilation of the 
assembly JAR (for
 developers who build with SBT).  For more information about how to do this, 
refer to the
-[Useful Developer Tools 
page](http://spark.apache.org/developer-tools.html#reducing-build-times).
+[Useful Developer Tools 
page](https://spark.apache.org/developer-tools.html#reducing-build-times).
 
 ## Encrypted Filesystems
 
@@ -182,7 +182,7 @@ to the `sharedSettings` val. See also [this 
PR](https://github.com/apache/spark/
 ## IntelliJ IDEA or Eclipse
 
 For help in setting up IntelliJ IDEA or Eclipse for Spark development, and 
troubleshooting, refer to the
-[Useful Developer Tools page](http://spark.apache.org/developer-tools.html).
+[Useful Developer Tools page](https://spark.apache.org/developer-tools.html).
 
 
 # Running Tests
@@ -203,7 +203,7 @@ The following is an example of a command to run the tests:
 ## Running Individual Tests
 
 For information about how to run individual tests, refer to the
-[Useful Developer Tools 
page](http://spark.apache.org/developer-tools.html#running-individual-tests).
+[Useful Developer Tools 
page](https://spark.apache.org/developer-tools.html#running-individual-tests).
 
 ## PySpark pip installable
 

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/contributing-to-spark.md
----------------------------------------------------------------------
diff --git a/docs/contributing-to-spark.md b/docs/contributing-to-spark.md
index 9252545..ede5584 100644
--- a/docs/contributing-to-spark.md
+++ b/docs/contributing-to-spark.md
@@ -5,4 +5,4 @@ title: Contributing to Spark
 
 The Spark team welcomes all forms of contributions, including bug reports, 
documentation or patches.
 For the newest information on how to contribute to the project, please read the
-[Contributing to Spark guide](http://spark.apache.org/contributing.html).
+[Contributing to Spark guide](https://spark.apache.org/contributing.html).

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/index.md
----------------------------------------------------------------------
diff --git a/docs/index.md b/docs/index.md
index 2f00941..40f628b 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -12,7 +12,7 @@ It also supports a rich set of higher-level tools including 
[Spark SQL](sql-prog
 
 # Downloading
 
-Get Spark from the [downloads page](http://spark.apache.org/downloads.html) of 
the project website. This documentation is for Spark version 
{{site.SPARK_VERSION}}. Spark uses Hadoop's client libraries for HDFS and YARN. 
Downloads are pre-packaged for a handful of popular Hadoop versions.
+Get Spark from the [downloads page](https://spark.apache.org/downloads.html) 
of the project website. This documentation is for Spark version 
{{site.SPARK_VERSION}}. Spark uses Hadoop's client libraries for HDFS and YARN. 
Downloads are pre-packaged for a handful of popular Hadoop versions.
 Users can also download a "Hadoop free" binary and run Spark with any Hadoop 
version
 [by augmenting Spark's classpath](hadoop-provided.html).
 Scala and Java users can include Spark in their projects using its Maven 
coordinates and in the future Python users can also install Spark from PyPI.
@@ -111,7 +111,7 @@ options for deployment:
   * [Amazon EC2](https://github.com/amplab/spark-ec2): scripts that let you 
launch a cluster on EC2 in about 5 minutes
   * [Standalone Deploy Mode](spark-standalone.html): launch a standalone 
cluster quickly without a third-party cluster manager
   * [Mesos](running-on-mesos.html): deploy a private cluster using
-      [Apache Mesos](http://mesos.apache.org)
+      [Apache Mesos](https://mesos.apache.org)
   * [YARN](running-on-yarn.html): deploy Spark on top of Hadoop NextGen (YARN)
   * [Kubernetes](running-on-kubernetes.html): deploy Spark on top of Kubernetes
 
@@ -127,20 +127,20 @@ options for deployment:
   * [Cloud Infrastructures](cloud-integration.html)
   * [OpenStack Swift](storage-openstack-swift.html)
 * [Building Spark](building-spark.html): build Spark using the Maven system
-* [Contributing to Spark](http://spark.apache.org/contributing.html)
-* [Third Party Projects](http://spark.apache.org/third-party-projects.html): 
related third party Spark projects
+* [Contributing to Spark](https://spark.apache.org/contributing.html)
+* [Third Party Projects](https://spark.apache.org/third-party-projects.html): 
related third party Spark projects
 
 **External Resources:**
 
-* [Spark Homepage](http://spark.apache.org)
-* [Spark Community](http://spark.apache.org/community.html) resources, 
including local meetups
+* [Spark Homepage](https://spark.apache.org)
+* [Spark Community](https://spark.apache.org/community.html) resources, 
including local meetups
 * [StackOverflow tag 
`apache-spark`](http://stackoverflow.com/questions/tagged/apache-spark)
-* [Mailing Lists](http://spark.apache.org/mailing-lists.html): ask questions 
about Spark here
+* [Mailing Lists](https://spark.apache.org/mailing-lists.html): ask questions 
about Spark here
 * [AMP Camps](http://ampcamp.berkeley.edu/): a series of training camps at UC 
Berkeley that featured talks and
   exercises about Spark, Spark Streaming, Mesos, and more. 
[Videos](http://ampcamp.berkeley.edu/6/),
   [slides](http://ampcamp.berkeley.edu/6/) and 
[exercises](http://ampcamp.berkeley.edu/6/exercises/) are
   available online for free.
-* [Code Examples](http://spark.apache.org/examples.html): more are also 
available in the `examples` subfolder of Spark 
([Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples),
+* [Code Examples](https://spark.apache.org/examples.html): more are also 
available in the `examples` subfolder of Spark 
([Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples),
  
[Java]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/java/org/apache/spark/examples),
  [Python]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/python),
  [R]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/r))

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/ml-migration-guides.md
----------------------------------------------------------------------
diff --git a/docs/ml-migration-guides.md b/docs/ml-migration-guides.md
index e473641..2047065 100644
--- a/docs/ml-migration-guides.md
+++ b/docs/ml-migration-guides.md
@@ -289,7 +289,7 @@ In the `spark.mllib` package, there were several breaking 
changes.  The first ch
 
 In the `spark.ml` package, the main API changes are from Spark SQL.  We list 
the most important changes here:
 
-* The old 
[SchemaRDD](http://spark.apache.org/docs/1.2.1/api/scala/index.html#org.apache.spark.sql.SchemaRDD)
 has been replaced with 
[DataFrame](api/scala/index.html#org.apache.spark.sql.DataFrame) with a 
somewhat modified API.  All algorithms in `spark.ml` which used to use 
SchemaRDD now use DataFrame.
+* The old 
[SchemaRDD](https://spark.apache.org/docs/1.2.1/api/scala/index.html#org.apache.spark.sql.SchemaRDD)
 has been replaced with 
[DataFrame](api/scala/index.html#org.apache.spark.sql.DataFrame) with a 
somewhat modified API.  All algorithms in `spark.ml` which used to use 
SchemaRDD now use DataFrame.
 * In Spark 1.2, we used implicit conversions from `RDD`s of `LabeledPoint` 
into `SchemaRDD`s by calling `import sqlContext._` where `sqlContext` was an 
instance of `SQLContext`.  These implicits have been moved, so we now call 
`import sqlContext.implicits._`.
 * Java APIs for SQL have also changed accordingly.  Please see the examples 
above and the [Spark SQL Programming Guide](sql-programming-guide.html) for 
details.
 

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/quick-start.md
----------------------------------------------------------------------
diff --git a/docs/quick-start.md b/docs/quick-start.md
index f1a2096..ef7af6c 100644
--- a/docs/quick-start.md
+++ b/docs/quick-start.md
@@ -12,7 +12,7 @@ interactive shell (in Python or Scala),
 then show how to write applications in Java, Scala, and Python.
 
 To follow along with this guide, first, download a packaged release of Spark 
from the
-[Spark website](http://spark.apache.org/downloads.html). Since we won't be 
using HDFS,
+[Spark website](https://spark.apache.org/downloads.html). Since we won't be 
using HDFS,
 you can download a package for any version of Hadoop.
 
 Note that, before Spark 2.0, the main programming interface of Spark was the 
Resilient Distributed Dataset (RDD). After Spark 2.0, RDDs are replaced by 
Dataset, which is strongly-typed like an RDD, but with richer optimizations 
under the hood. The RDD interface is still supported, and you can get a more 
detailed reference at the [RDD programming guide](rdd-programming-guide.html). 
However, we highly recommend you to switch to use Dataset, which has better 
performance than RDD. See the [SQL programming 
guide](sql-programming-guide.html) to get more information about Dataset.

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/rdd-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/rdd-programming-guide.md b/docs/rdd-programming-guide.md
index b642409..d95b757 100644
--- a/docs/rdd-programming-guide.md
+++ b/docs/rdd-programming-guide.md
@@ -106,7 +106,7 @@ You can also use `bin/pyspark` to launch an interactive 
Python shell.
 
 If you wish to access HDFS data, you need to use a build of PySpark linking
 to your version of HDFS.
-[Prebuilt packages](http://spark.apache.org/downloads.html) are also available 
on the Spark homepage
+[Prebuilt packages](https://spark.apache.org/downloads.html) are also 
available on the Spark homepage
 for common HDFS versions.
 
 Finally, you need to import some Spark classes into your program. Add the 
following line:
@@ -1569,7 +1569,7 @@ as Spark does not support two contexts running 
concurrently in the same program.
 
 # Where to Go from Here
 
-You can see some [example Spark 
programs](http://spark.apache.org/examples.html) on the Spark website.
+You can see some [example Spark 
programs](https://spark.apache.org/examples.html) on the Spark website.
 In addition, Spark includes several samples in the `examples` directory
 
([Scala]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/scala/org/apache/spark/examples),
  
[Java]({{site.SPARK_GITHUB_URL}}/tree/master/examples/src/main/java/org/apache/spark/examples),

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/running-on-mesos.md
----------------------------------------------------------------------
diff --git a/docs/running-on-mesos.md b/docs/running-on-mesos.md
index 3e76d47..b473e65 100644
--- a/docs/running-on-mesos.md
+++ b/docs/running-on-mesos.md
@@ -672,7 +672,7 @@ See the [configuration page](configuration.html) for 
information on Spark config
   <td><code>spark.mesos.dispatcher.historyServer.url</code></td>
   <td><code>(none)</code></td>
   <td>
-    Set the URL of the <a href="http://spark.apache.org/docs/latest/monitoring.html#viewing-after-the-fact">history
+    Set the URL of the <a href="monitoring.html#viewing-after-the-fact">history
     server</a>.  The dispatcher will then link each driver to its entry
     in the history server.
   </td>

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/running-on-yarn.md
----------------------------------------------------------------------
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index 1c1f40c..e3d67c3 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -61,7 +61,7 @@ In `cluster` mode, the driver runs on a different machine 
than the client, so `S
 # Preparations
 
 Running Spark on YARN requires a binary distribution of Spark which is built 
with YARN support.
-Binary distributions can be downloaded from the [downloads 
page](http://spark.apache.org/downloads.html) of the project website.
+Binary distributions can be downloaded from the [downloads 
page](https://spark.apache.org/downloads.html) of the project website.
 To build Spark yourself, refer to [Building Spark](building-spark.html).
 
 To make Spark runtime jars accessible from YARN side, you can specify 
`spark.yarn.archive` or `spark.yarn.jars`. For details please refer to [Spark 
Properties](running-on-yarn.html#spark-properties). If neither 
`spark.yarn.archive` nor `spark.yarn.jars` is specified, Spark will create a 
zip file with all jars under `$SPARK_HOME/jars` and upload it to the 
distributed cache.

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/security.md
----------------------------------------------------------------------
diff --git a/docs/security.md b/docs/security.md
index c8eec73..7fb3e17 100644
--- a/docs/security.md
+++ b/docs/security.md
@@ -49,7 +49,7 @@ respectively by default) are restricted to hosts that are 
trusted to submit jobs
 
 Spark supports AES-based encryption for RPC connections. For encryption to be 
enabled, RPC
 authentication must also be enabled and properly configured. AES encryption 
uses the
-[Apache Commons Crypto](http://commons.apache.org/proper/commons-crypto/) 
library, and Spark's
+[Apache Commons Crypto](https://commons.apache.org/proper/commons-crypto/) 
library, and Spark's
 configuration system allows access to that library's configuration for 
advanced users.
 
 There is also support for SASL-based encryption, although it should be 
considered deprecated. It
@@ -169,7 +169,7 @@ The following settings cover enabling encryption for data 
written to disk:
 
 ## Authentication and Authorization
 
-Enabling authentication for the Web UIs is done using [javax servlet 
filters](http://docs.oracle.com/javaee/6/api/javax/servlet/Filter.html).
+Enabling authentication for the Web UIs is done using [javax servlet 
filters](https://docs.oracle.com/javaee/6/api/javax/servlet/Filter.html).
 You will need a filter that implements the authentication method you want to 
deploy. Spark does not
 provide any built-in authentication filters.
 
@@ -492,7 +492,7 @@ distributed with the application using the `--files` 
command line argument (or t
 configuration should just reference the file name with no absolute path.
 
 Distributing local key stores this way may require the files to be staged in 
HDFS (or other similar
-distributed file system used by the cluster), so it's recommended that the 
undelying file system be
+distributed file system used by the cluster), so it's recommended that the 
underlying file system be
 configured with security in mind (e.g. by enabling authentication and wire 
encryption).
 
 ### Standalone mode

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/sparkr.md
----------------------------------------------------------------------
diff --git a/docs/sparkr.md b/docs/sparkr.md
index 84e9b4a..b4248e8 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -128,7 +128,7 @@ head(df)
 SparkR supports operating on a variety of data sources through the 
`SparkDataFrame` interface. This section describes the general methods for 
loading and saving data using Data Sources. You can check the Spark SQL 
programming guide for more [specific 
options](sql-programming-guide.html#manually-specifying-options) that are 
available for the built-in data sources.
 
 The general method for creating SparkDataFrames from data sources is 
`read.df`. This method takes in the path for the file to load and the type of 
data source, and the currently active SparkSession will be used automatically.
-SparkR supports reading JSON, CSV and Parquet files natively, and through 
packages available from sources like [Third Party 
Projects](http://spark.apache.org/third-party-projects.html), you can find data 
source connectors for popular file formats like Avro. These packages can either 
be added by
+SparkR supports reading JSON, CSV and Parquet files natively, and through 
packages available from sources like [Third Party 
Projects](https://spark.apache.org/third-party-projects.html), you can find 
data source connectors for popular file formats like Avro. These packages can 
either be added by
 specifying `--packages` with `spark-submit` or `sparkR` commands, or if 
initializing SparkSession with `sparkPackages` parameter when in an interactive 
R shell or from RStudio.
 
 <div data-lang="r" markdown="1">

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index d9ebc3c..8e308d5 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1796,7 +1796,7 @@ strings, e.g. integer indices. See 
[pandas.DataFrame](https://pandas.pydata.org/
 on how to label columns when constructing a `pandas.DataFrame`.
 
 Note that all data for a group will be loaded into memory before the function 
is applied. This can
-lead to out of memory exceptons, especially if the group sizes are skewed. The 
configuration for
+lead to out of memory exceptions, especially if the group sizes are skewed. 
The configuration for
 [maxRecordsPerBatch](#setting-arrow-batch-size) is not applied on groups and 
it is up to the user
 to ensure that the grouped data will fit into the available memory.
 
@@ -1876,7 +1876,7 @@ working with timestamps in `pandas_udf`s to get the best 
performance, see
 
 ## Upgrading From Spark SQL 2.3 to 2.4
 
-  - Since Spark 2.4, Spark will evaluate the set operations referenced in a 
query by following a precedence rule as per the SQL standard. If the order is 
not specified by parentheses, set operations are performed from left to right 
with the exception that all INTERSECT operations are performed before any 
UNION, EXCEPT or MINUS operations. The old behaviour of giving equal precedence 
to all the set operations are preserved under a newly added configuaration 
`spark.sql.legacy.setopsPrecedence.enabled` with a default value of `false`. 
When this property is set to `true`, spark will evaluate the set operators from 
left to right as they appear in the query given no explicit ordering is 
enforced by usage of parenthesis.
+  - Since Spark 2.4, Spark will evaluate the set operations referenced in a 
query by following a precedence rule as per the SQL standard. If the order is 
not specified by parentheses, set operations are performed from left to right 
with the exception that all INTERSECT operations are performed before any 
UNION, EXCEPT or MINUS operations. The old behaviour of giving equal precedence 
to all the set operations are preserved under a newly added configuration 
`spark.sql.legacy.setopsPrecedence.enabled` with a default value of `false`. 
When this property is set to `true`, spark will evaluate the set operators from 
left to right as they appear in the query given no explicit ordering is 
enforced by usage of parenthesis.
   - Since Spark 2.4, Spark will display table description column Last Access 
value as UNKNOWN when the value was Jan 01 1970.
   - Since Spark 2.4, Spark maximizes the usage of a vectorized ORC reader for 
ORC files by default. To do that, `spark.sql.orc.impl` and 
`spark.sql.orc.filterPushdown` change their default values to `native` and 
`true` respectively.
   - In PySpark, when Arrow optimization is enabled, previously `toPandas` just 
failed when Arrow optimization is unable to be used whereas `createDataFrame` 
from Pandas DataFrame allowed the fallback to non-optimization. Now, both 
`toPandas` and `createDataFrame` from Pandas DataFrame allow the fallback by 
default, which can be switched off by 
`spark.sql.execution.arrow.fallback.enabled`.
@@ -2162,7 +2162,7 @@ See the API docs for `SQLContext.read` (
   <a href="api/python/pyspark.sql.html#pyspark.sql.SQLContext.read">Python</a>
 ) and `DataFrame.write` (
   <a 
href="api/scala/index.html#org.apache.spark.sql.DataFrame@write:DataFrameWriter">Scala</a>,
-  <a href="api/java/org/apache/spark/sql/DataFrame.html#write()">Java</a>,
+  <a href="api/java/org/apache/spark/sql/Dataset.html#write()">Java</a>,
   <a href="api/python/pyspark.sql.html#pyspark.sql.DataFrame.write">Python</a>
 ) more information.
 

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/streaming-kinesis-integration.md
----------------------------------------------------------------------
diff --git a/docs/streaming-kinesis-integration.md 
b/docs/streaming-kinesis-integration.md
index 678b064..6a52e8a 100644
--- a/docs/streaming-kinesis-integration.md
+++ b/docs/streaming-kinesis-integration.md
@@ -196,7 +196,7 @@ A Kinesis stream can be set up at one of the valid Kinesis 
endpoints with 1 or m
 #### Running the Example
 To run the example,
 
-- Download a Spark binary from the [download 
site](http://spark.apache.org/downloads.html).
+- Download a Spark binary from the [download 
site](https://spark.apache.org/downloads.html).
 
 - Set up Kinesis stream (see earlier section) within AWS. Note the name of the 
Kinesis stream and the endpoint URL corresponding to the region where the 
stream was created.
 

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/streaming-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/streaming-programming-guide.md 
b/docs/streaming-programming-guide.md
index 118b053..0ca0f2a 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -915,8 +915,7 @@ JavaPairDStream<String, Integer> runningCounts = 
pairs.updateStateByKey(updateFu
 The update function will be called for each word, with `newValues` having a 
sequence of 1's (from
 the `(word, 1)` pairs) and the `runningCount` having the previous count. For 
the complete
 Java code, take a look at the example
-[JavaStatefulNetworkWordCount.java]({{site.SPARK_GITHUB_URL}}/blob/v{{site.SPARK_VERSION_SHORT}}/examples/src/main/java/org/apache/spark/examples/streaming
-/JavaStatefulNetworkWordCount.java).
+[JavaStatefulNetworkWordCount.java]({{site.SPARK_GITHUB_URL}}/blob/v{{site.SPARK_VERSION_SHORT}}/examples/src/main/java/org/apache/spark/examples/streaming/JavaStatefulNetworkWordCount.java).
 
 </div>
 <div data-lang="python" markdown="1">
@@ -2470,7 +2469,7 @@ additional effort may be necessary to achieve 
exactly-once semantics. There are
     - [Kafka Integration Guide](streaming-kafka-integration.html)
     - [Kinesis Integration Guide](streaming-kinesis-integration.html)
     - [Custom Receiver Guide](streaming-custom-receivers.html)
-* Third-party DStream data sources can be found in [Third Party 
Projects](http://spark.apache.org/third-party-projects.html)
+* Third-party DStream data sources can be found in [Third Party 
Projects](https://spark.apache.org/third-party-projects.html)
 * API documentation
   - Scala docs
     * 
[StreamingContext](api/scala/index.html#org.apache.spark.streaming.StreamingContext)
 and

http://git-wip-us.apache.org/repos/asf/spark/blob/35f7f5ce/docs/structured-streaming-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/structured-streaming-programming-guide.md 
b/docs/structured-streaming-programming-guide.md
index b832f71..355a6cc 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -17,7 +17,7 @@ In this guide, we are going to walk you through the 
programming model and the AP
 # Quick Example
 Let's say you want to maintain a running word count of text data received 
from a data server listening on a TCP socket. Let's see how you can express 
this using Structured Streaming. You can see the full code in
 
[Scala]({{site.SPARK_GITHUB_URL}}/blob/v{{site.SPARK_VERSION_SHORT}}/examples/src/main/scala/org/apache/spark/examples/sql/streaming/StructuredNetworkWordCount.scala)/[Java]({{site.SPARK_GITHUB_URL}}/blob/v{{site.SPARK_VERSION_SHORT}}/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredNetworkWordCount.java)/[Python]({{site.SPARK_GITHUB_URL}}/blob/v{{site.SPARK_VERSION_SHORT}}/examples/src/main/python/sql/streaming/structured_network_wordcount.py)/[R]({{site.SPARK_GITHUB_URL}}/blob/v{{site.SPARK_VERSION_SHORT}}/examples/src/main/r/streaming/structured_network_wordcount.R).
-And if you [download Spark](http://spark.apache.org/downloads.html), you can 
directly [run the example](index.html#running-the-examples-and-shell). In any 
case, let's walk through the example step-by-step and understand how it 
works. First, we have to import the necessary classes and create a local 
SparkSession, the starting point of all functionalities related to Spark.
+And if you [download Spark](https://spark.apache.org/downloads.html), you can 
directly [run the example](index.html#running-the-examples-and-shell). In any 
case, let's walk through the example step-by-step and understand how it 
works. First, we have to import the necessary classes and create a local 
SparkSession, the starting point of all functionalities related to Spark.
 
 <div class="codetabs">
 <div data-lang="scala"  markdown="1">


