Repository: spark
Updated Branches:
  refs/heads/master 41d5aaec8 -> c5daccb1d


[MINOR] Update all DOI links to preferred resolver

## What changes were proposed in this pull request?

The DOI Foundation recommends [this new 
resolver](https://www.doi.org/doi_handbook/3_Resolution.html#3.8). Accordingly, 
this PR uses `sed` to rewrite all static DOI links to the preferred 
`https://doi.org/` form.
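For illustration, the kind of bulk rewrite described above can be sketched in a few lines (a hypothetical reconstruction, not the exact `sed` invocation used for this commit):

```python
import re

# Hypothetical sketch of the bulk DOI-link rewrite. The commit message says the
# change was done with `sed`; this mirrors the same substitution in Python.
# Both legacy forms (http://dx.doi.org/ and http://doi.org/) map to the
# preferred https://doi.org/ resolver.
LEGACY_DOI = re.compile(r"http://(?:dx\.)?doi\.org/")

def rewrite_dois(text: str) -> str:
    """Rewrite legacy DOI resolver URLs to the preferred https://doi.org/ form."""
    return LEGACY_DOI.sub("https://doi.org/", text)

print(rewrite_dois("see http://dx.doi.org/10.1145/335191.335372"))
# prints: see https://doi.org/10.1145/335191.335372
```

In a working tree the same substitution would typically be applied only to files that actually contain a legacy link (e.g. via `git grep -l`).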

## How was this patch tested?

It wasn't tested, since a link-only change seems as safe as a "[typo 
fix](https://spark.apache.org/contributing.html)".
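One cheap, offline sanity check (illustrative only; nothing like this was run for the patch) would be to scan the changed files for surviving legacy resolver links:

```python
import re

# Illustrative offline check: list any legacy DOI resolver links left in a
# piece of text. An empty result means the rewrite caught everything.
def find_legacy_doi_links(text: str) -> list[str]:
    # "http://" does not match inside "https://", so rewritten links pass.
    return re.findall(r"http://(?:dx\.)?doi\.org/\S+", text)

sample = "a https://doi.org/10.1145/335191.335372 b http://dx.doi.org/10.1109/ICDM.2008.22"
print(find_legacy_doi_links(sample))
# prints: ['http://dx.doi.org/10.1109/ICDM.2008.22']
```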

If any of these files are included from other projects and should be updated 
there as well, please let me know.

Closes #23129 from katrinleinweber/resolve-DOIs-securely.

Authored-by: Katrin Leinweber <[email protected]>
Signed-off-by: Sean Owen <[email protected]>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c5daccb1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c5daccb1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c5daccb1

Branch: refs/heads/master
Commit: c5daccb1dafca528ccb4be65d63c943bf9a7b0f2
Parents: 41d5aae
Author: Katrin Leinweber <[email protected]>
Authored: Sun Nov 25 17:43:55 2018 -0600
Committer: Sean Owen <[email protected]>
Committed: Sun Nov 25 17:43:55 2018 -0600

----------------------------------------------------------------------
 R/pkg/R/stats.R                                           |  4 ++--
 .../scala/org/apache/spark/api/java/JavaPairRDD.scala     |  6 +++---
 .../scala/org/apache/spark/api/java/JavaRDDLike.scala     |  2 +-
 .../scala/org/apache/spark/rdd/PairRDDFunctions.scala     |  8 ++++----
 core/src/main/scala/org/apache/spark/rdd/RDD.scala        |  4 ++--
 docs/ml-classification-regression.md                      |  4 ++--
 docs/ml-collaborative-filtering.md                        |  4 ++--
 docs/ml-frequent-pattern-mining.md                        |  8 ++++----
 docs/mllib-collaborative-filtering.md                     |  4 ++--
 docs/mllib-frequent-pattern-mining.md                     |  6 +++---
 docs/mllib-isotonic-regression.md                         |  4 ++--
 .../scala/org/apache/spark/ml/clustering/KMeans.scala     |  2 +-
 .../src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala |  4 ++--
 .../main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala   |  2 +-
 .../scala/org/apache/spark/ml/recommendation/ALS.scala    |  2 +-
 .../main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala  |  4 ++--
 .../scala/org/apache/spark/mllib/fpm/PrefixSpan.scala     |  2 +-
 .../apache/spark/mllib/linalg/distributed/RowMatrix.scala |  2 +-
 .../scala/org/apache/spark/mllib/recommendation/ALS.scala |  2 +-
 python/pyspark/ml/fpm.py                                  |  6 +++---
 python/pyspark/ml/recommendation.py                       |  2 +-
 python/pyspark/mllib/fpm.py                               |  2 +-
 python/pyspark/mllib/linalg/distributed.py                |  2 +-
 python/pyspark/rdd.py                                     |  2 +-
 python/pyspark/sql/dataframe.py                           |  4 ++--
 .../spark/sql/catalyst/util/QuantileSummaries.scala       |  2 +-
 .../org/apache/spark/sql/DataFrameStatFunctions.scala     | 10 +++++-----
 .../apache/spark/sql/execution/stat/FrequentItems.scala   |  2 +-
 .../apache/spark/sql/execution/stat/StatFunctions.scala   |  2 +-
 29 files changed, 54 insertions(+), 54 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/R/pkg/R/stats.R
----------------------------------------------------------------------
diff --git a/R/pkg/R/stats.R b/R/pkg/R/stats.R
index 497f18c..7252351 100644
--- a/R/pkg/R/stats.R
+++ b/R/pkg/R/stats.R
@@ -109,7 +109,7 @@ setMethod("corr",
 #'
 #' Finding frequent items for columns, possibly with false positives.
 #' Using the frequent element count algorithm described in
-#' \url{http://dx.doi.org/10.1145/762471.762473}, proposed by Karp, Schenker, and Papadimitriou.
+#' \url{https://doi.org/10.1145/762471.762473}, proposed by Karp, Schenker, and Papadimitriou.
 #'
 #' @param x A SparkDataFrame.
 #' @param cols A vector column names to search frequent items in.
@@ -143,7 +143,7 @@ setMethod("freqItems", signature(x = "SparkDataFrame", cols = "character"),
 #' *exact* rank of x is close to (p * N). More precisely,
 #'   floor((p - err) * N) <= rank(x) <= ceil((p + err) * N).
 #' This method implements a variation of the Greenwald-Khanna algorithm (with some speed
-#' optimizations). The algorithm was first present in [[http://dx.doi.org/10.1145/375663.375670
+#' optimizations). The algorithm was first present in [[https://doi.org/10.1145/375663.375670
 #' Space-efficient Online Computation of Quantile Summaries]] by Greenwald and Khanna.
 #' Note that NA values will be ignored in numerical columns before calculation. For
 #'   columns only containing NA values, an empty list is returned.

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala b/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
index 80a4f84..50ed8d9 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala
@@ -952,7 +952,7 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
    *
    * The algorithm used is based on streamlib's implementation of "HyperLogLog in Practice:
    * Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", available
-   * <a href="http://dx.doi.org/10.1145/2452376.2452456">here</a>.
+   * <a href="https://doi.org/10.1145/2452376.2452456">here</a>.
    *
    * @param relativeSD Relative accuracy. Smaller values create counters that require more space.
    *                   It must be greater than 0.000017.
@@ -969,7 +969,7 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
    *
    * The algorithm used is based on streamlib's implementation of "HyperLogLog in Practice:
    * Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", available
-   * <a href="http://dx.doi.org/10.1145/2452376.2452456">here</a>.
+   * <a href="https://doi.org/10.1145/2452376.2452456">here</a>.
    *
    * @param relativeSD Relative accuracy. Smaller values create counters that require more space.
    *                   It must be greater than 0.000017.
@@ -985,7 +985,7 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
    *
    * The algorithm used is based on streamlib's implementation of "HyperLogLog in Practice:
    * Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", available
-   * <a href="http://dx.doi.org/10.1145/2452376.2452456">here</a>.
+   * <a href="https://doi.org/10.1145/2452376.2452456">here</a>.
    *
    * @param relativeSD Relative accuracy. Smaller values create counters that require more space.
    *                   It must be greater than 0.000017.

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala b/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
index 91ae100..5ba8219 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
@@ -685,7 +685,7 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends Serializable {
    *
    * The algorithm used is based on streamlib's implementation of "HyperLogLog in Practice:
    * Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", available
-   * <a href="http://dx.doi.org/10.1145/2452376.2452456">here</a>.
+   * <a href="https://doi.org/10.1145/2452376.2452456">here</a>.
    *
    * @param relativeSD Relative accuracy. Smaller values create counters that require more space.
    *                   It must be greater than 0.000017.

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala b/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
index e68c6b1..4bf4f08 100644
--- a/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
@@ -394,7 +394,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    *
    * The algorithm used is based on streamlib's implementation of "HyperLogLog in Practice:
    * Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", available
-   * <a href="http://dx.doi.org/10.1145/2452376.2452456">here</a>.
+   * <a href="https://doi.org/10.1145/2452376.2452456">here</a>.
    *
    * The relative accuracy is approximately `1.054 / sqrt(2^p)`. Setting a nonzero (`sp` is
    * greater than `p`) would trigger sparse representation of registers, which may reduce the
@@ -436,7 +436,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    *
    * The algorithm used is based on streamlib's implementation of "HyperLogLog in Practice:
    * Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", available
-   * <a href="http://dx.doi.org/10.1145/2452376.2452456">here</a>.
+   * <a href="https://doi.org/10.1145/2452376.2452456">here</a>.
    *
    * @param relativeSD Relative accuracy. Smaller values create counters that require more space.
    *                   It must be greater than 0.000017.
@@ -456,7 +456,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    *
    * The algorithm used is based on streamlib's implementation of "HyperLogLog in Practice:
    * Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", available
-   * <a href="http://dx.doi.org/10.1145/2452376.2452456">here</a>.
+   * <a href="https://doi.org/10.1145/2452376.2452456">here</a>.
    *
    * @param relativeSD Relative accuracy. Smaller values create counters that require more space.
    *                   It must be greater than 0.000017.
@@ -473,7 +473,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])
    *
    * The algorithm used is based on streamlib's implementation of "HyperLogLog in Practice:
    * Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", available
-   * <a href="http://dx.doi.org/10.1145/2452376.2452456">here</a>.
+   * <a href="https://doi.org/10.1145/2452376.2452456">here</a>.
    *
    * @param relativeSD Relative accuracy. Smaller values create counters that require more space.
    *                   It must be greater than 0.000017.

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/core/src/main/scala/org/apache/spark/rdd/RDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index 743e344..6a25ee2 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -1258,7 +1258,7 @@ abstract class RDD[T: ClassTag](
    *
    * The algorithm used is based on streamlib's implementation of "HyperLogLog in Practice:
    * Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", available
-   * <a href="http://dx.doi.org/10.1145/2452376.2452456">here</a>.
+   * <a href="https://doi.org/10.1145/2452376.2452456">here</a>.
    *
    * The relative accuracy is approximately `1.054 / sqrt(2^p)`. Setting a nonzero (`sp` is greater
    * than `p`) would trigger sparse representation of registers, which may reduce the memory
@@ -1290,7 +1290,7 @@ abstract class RDD[T: ClassTag](
    *
    * The algorithm used is based on streamlib's implementation of "HyperLogLog in Practice:
    * Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm", available
-   * <a href="http://dx.doi.org/10.1145/2452376.2452456">here</a>.
+   * <a href="https://doi.org/10.1145/2452376.2452456">here</a>.
    *
    * @param relativeSD Relative accuracy. Smaller values create counters that require more space.
    *                   It must be greater than 0.000017.
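The accuracy formula quoted in these scaladoc hunks, "approximately `1.054 / sqrt(2^p)`", can be inverted to estimate how much register precision a target error requires; a rough illustrative calculation (plain arithmetic, not Spark internals):

```python
import math

# Illustrative: invert relativeSD ~= 1.054 / sqrt(2^p) to find the smallest
# HyperLogLog precision p that meets a target relative error. This is plain
# arithmetic derived from the formula quoted in the docs, not Spark code.
def precision_for(relative_sd: float) -> int:
    # sqrt(2^p) >= 1.054 / relativeSD  =>  p >= 2 * log2(1.054 / relativeSD)
    return math.ceil(2.0 * math.log2(1.054 / relative_sd))

print(precision_for(0.05))
# prints: 9
```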

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/docs/ml-classification-regression.md
----------------------------------------------------------------------
diff --git a/docs/ml-classification-regression.md b/docs/ml-classification-regression.md
index b3d1090..42912a2 100644
--- a/docs/ml-classification-regression.md
+++ b/docs/ml-classification-regression.md
@@ -941,9 +941,9 @@ Essentially isotonic regression is a
 best fitting the original data points.
 
 We implement a
-[pool adjacent violators algorithm](http://doi.org/10.1198/TECH.2010.10111)
+[pool adjacent violators algorithm](https://doi.org/10.1198/TECH.2010.10111)
 which uses an approach to
-[parallelizing isotonic regression](http://doi.org/10.1007/978-3-642-99789-1_10).
+[parallelizing isotonic regression](https://doi.org/10.1007/978-3-642-99789-1_10).
 The training input is a DataFrame which contains three columns
 label, features and weight. Additionally, IsotonicRegression algorithm has one
 optional parameter called $isotonic$ defaulting to true.
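The pool-adjacent-violators algorithm referenced in this hunk can be sketched in a few lines; the following is an illustrative, unweighted, single-machine version (Spark's implementation is weighted and parallelized, per the linked papers):

```python
# Minimal pool-adjacent-violators sketch for increasing isotonic regression.
# Illustrative only: unweighted and sequential, unlike Spark's implementation.
def isotonic_fit(y):
    # Maintain blocks as [sum, count]; merge backwards whenever a block's
    # mean exceeds the mean of the block after it (an adjacent violation).
    blocks = []
    for v in y:
        blocks.append([v, 1])
        # Compare means via cross-multiplication to avoid division:
        # s_prev/c_prev > s_last/c_last  <=>  s_prev*c_last > s_last*c_prev
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    # Expand each pooled block back to per-point fitted values.
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out

print(isotonic_fit([1, 3, 2, 4]))
# prints: [1.0, 2.5, 2.5, 4.0]
```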

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/docs/ml-collaborative-filtering.md
----------------------------------------------------------------------
diff --git a/docs/ml-collaborative-filtering.md b/docs/ml-collaborative-filtering.md
index 8b0f287..5864664 100644
--- a/docs/ml-collaborative-filtering.md
+++ b/docs/ml-collaborative-filtering.md
@@ -41,7 +41,7 @@ for example, users giving ratings to movies.
 
 It is common in many real-world use cases to only have access to *implicit feedback* (e.g. views,
 clicks, purchases, likes, shares etc.). The approach used in `spark.ml` to deal with such data is taken
-from [Collaborative Filtering for Implicit Feedback Datasets](http://dx.doi.org/10.1109/ICDM.2008.22).
+from [Collaborative Filtering for Implicit Feedback Datasets](https://doi.org/10.1109/ICDM.2008.22).
 Essentially, instead of trying to model the matrix of ratings directly, this approach treats the data
 as numbers representing the *strength* in observations of user actions (such as the number of clicks,
 or the cumulative duration someone spent viewing a movie). Those numbers are then related to the level of
@@ -55,7 +55,7 @@ We scale the regularization parameter `regParam` in solving each least squares p
 the number of ratings the user generated in updating user factors,
 or the number of ratings the product received in updating product factors.
 This approach is named "ALS-WR" and discussed in the paper
-"[Large-Scale Parallel Collaborative Filtering for the Netflix Prize](http://dx.doi.org/10.1007/978-3-540-68880-8_32)".
+"[Large-Scale Parallel Collaborative Filtering for the Netflix Prize](https://doi.org/10.1007/978-3-540-68880-8_32)".
 It makes `regParam` less dependent on the scale of the dataset, so we can apply the
 best parameter learned from a sampled subset to the full dataset and expect similar performance.
 

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/docs/ml-frequent-pattern-mining.md
----------------------------------------------------------------------
diff --git a/docs/ml-frequent-pattern-mining.md b/docs/ml-frequent-pattern-mining.md
index c2043d4..f613664 100644
--- a/docs/ml-frequent-pattern-mining.md
+++ b/docs/ml-frequent-pattern-mining.md
@@ -18,7 +18,7 @@ for more information.
 ## FP-Growth
 
 The FP-growth algorithm is described in the paper
-[Han et al., Mining frequent patterns without candidate generation](http://dx.doi.org/10.1145/335191.335372),
+[Han et al., Mining frequent patterns without candidate generation](https://doi.org/10.1145/335191.335372),
 where "FP" stands for frequent pattern.
 Given a dataset of transactions, the first step of FP-growth is to calculate item frequencies and identify frequent items.
 Different from [Apriori-like](http://en.wikipedia.org/wiki/Apriori_algorithm) algorithms designed for the same purpose,
@@ -26,7 +26,7 @@ the second step of FP-growth uses a suffix tree (FP-tree) structure to encode tr
 explicitly, which are usually expensive to generate.
 After the second step, the frequent itemsets can be extracted from the FP-tree.
 In `spark.mllib`, we implemented a parallel version of FP-growth called PFP,
-as described in [Li et al., PFP: Parallel FP-growth for query recommendation](http://dx.doi.org/10.1145/1454008.1454027).
+as described in [Li et al., PFP: Parallel FP-growth for query recommendation](https://doi.org/10.1145/1454008.1454027).
 PFP distributes the work of growing FP-trees based on the suffixes of transactions,
 and hence is more scalable than a single-machine implementation.
 We refer users to the papers for more details.
@@ -90,7 +90,7 @@ Refer to the [R API docs](api/R/spark.fpGrowth.html) for more details.
 
 PrefixSpan is a sequential pattern mining algorithm described in
 [Pei et al., Mining Sequential Patterns by Pattern-Growth: The
-PrefixSpan Approach](http://dx.doi.org/10.1109%2FTKDE.2004.77). We refer
+PrefixSpan Approach](https://doi.org/10.1109%2FTKDE.2004.77). We refer
 the reader to the referenced paper for formalizing the sequential
 pattern mining problem.
 
@@ -137,4 +137,4 @@ Refer to the [R API docs](api/R/spark.prefixSpan.html) for more details.
 {% include_example r/ml/prefixSpan.R %}
 </div>
 
-</div>
\ No newline at end of file
+</div>

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/docs/mllib-collaborative-filtering.md
----------------------------------------------------------------------
diff --git a/docs/mllib-collaborative-filtering.md b/docs/mllib-collaborative-filtering.md
index b230002..aeebb26 100644
--- a/docs/mllib-collaborative-filtering.md
+++ b/docs/mllib-collaborative-filtering.md
@@ -37,7 +37,7 @@ for example, users giving ratings to movies.
 
 It is common in many real-world use cases to only have access to *implicit feedback* (e.g. views,
 clicks, purchases, likes, shares etc.). The approach used in `spark.mllib` to deal with such data is taken
-from [Collaborative Filtering for Implicit Feedback Datasets](http://dx.doi.org/10.1109/ICDM.2008.22).
+from [Collaborative Filtering for Implicit Feedback Datasets](https://doi.org/10.1109/ICDM.2008.22).
 Essentially, instead of trying to model the matrix of ratings directly, this approach treats the data
 as numbers representing the *strength* in observations of user actions (such as the number of clicks,
 or the cumulative duration someone spent viewing a movie). Those numbers are then related to the level of
@@ -51,7 +51,7 @@ Since v1.1, we scale the regularization parameter `lambda` in solving each least
 the number of ratings the user generated in updating user factors,
 or the number of ratings the product received in updating product factors.
 This approach is named "ALS-WR" and discussed in the paper
-"[Large-Scale Parallel Collaborative Filtering for the Netflix Prize](http://dx.doi.org/10.1007/978-3-540-68880-8_32)".
+"[Large-Scale Parallel Collaborative Filtering for the Netflix Prize](https://doi.org/10.1007/978-3-540-68880-8_32)".
 It makes `lambda` less dependent on the scale of the dataset, so we can apply the
 best parameter learned from a sampled subset to the full dataset and expect similar performance.
 

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/docs/mllib-frequent-pattern-mining.md
----------------------------------------------------------------------
diff --git a/docs/mllib-frequent-pattern-mining.md b/docs/mllib-frequent-pattern-mining.md
index 0d3192c..8e45057 100644
--- a/docs/mllib-frequent-pattern-mining.md
+++ b/docs/mllib-frequent-pattern-mining.md
@@ -15,7 +15,7 @@ a popular algorithm to mining frequent itemsets.
 ## FP-growth
 
 The FP-growth algorithm is described in the paper
-[Han et al., Mining frequent patterns without candidate generation](http://dx.doi.org/10.1145/335191.335372),
+[Han et al., Mining frequent patterns without candidate generation](https://doi.org/10.1145/335191.335372),
 where "FP" stands for frequent pattern.
 Given a dataset of transactions, the first step of FP-growth is to calculate item frequencies and identify frequent items.
 Different from [Apriori-like](http://en.wikipedia.org/wiki/Apriori_algorithm) algorithms designed for the same purpose,
@@ -23,7 +23,7 @@ the second step of FP-growth uses a suffix tree (FP-tree) structure to encode tr
 explicitly, which are usually expensive to generate.
 After the second step, the frequent itemsets can be extracted from the FP-tree.
 In `spark.mllib`, we implemented a parallel version of FP-growth called PFP,
-as described in [Li et al., PFP: Parallel FP-growth for query recommendation](http://dx.doi.org/10.1145/1454008.1454027).
+as described in [Li et al., PFP: Parallel FP-growth for query recommendation](https://doi.org/10.1145/1454008.1454027).
 PFP distributes the work of growing FP-trees based on the suffixes of transactions,
 and hence more scalable than a single-machine implementation.
 We refer users to the papers for more details.
@@ -122,7 +122,7 @@ Refer to the [`AssociationRules` Java docs](api/java/org/apache/spark/mllib/fpm/
 
 PrefixSpan is a sequential pattern mining algorithm described in
 [Pei et al., Mining Sequential Patterns by Pattern-Growth: The
-PrefixSpan Approach](http://dx.doi.org/10.1109%2FTKDE.2004.77). We refer
+PrefixSpan Approach](https://doi.org/10.1109%2FTKDE.2004.77). We refer
 the reader to the referenced paper for formalizing the sequential
 pattern mining problem.
 

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/docs/mllib-isotonic-regression.md
----------------------------------------------------------------------
diff --git a/docs/mllib-isotonic-regression.md b/docs/mllib-isotonic-regression.md
index 99cab98..9964fce 100644
--- a/docs/mllib-isotonic-regression.md
+++ b/docs/mllib-isotonic-regression.md
@@ -24,9 +24,9 @@ Essentially isotonic regression is a
 best fitting the original data points.
 
 `spark.mllib` supports a
-[pool adjacent violators algorithm](http://doi.org/10.1198/TECH.2010.10111)
+[pool adjacent violators algorithm](https://doi.org/10.1198/TECH.2010.10111)
 which uses an approach to
-[parallelizing isotonic regression](http://doi.org/10.1007/978-3-642-99789-1_10).
+[parallelizing isotonic regression](https://doi.org/10.1007/978-3-642-99789-1_10).
 The training input is an RDD of tuples of three double values that represent
 label, feature and weight in this order. Additionally, IsotonicRegression algorithm has one
 optional parameter called $isotonic$ defaulting to true.

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
index 919496a..2eed84d 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
@@ -263,7 +263,7 @@ object KMeansModel extends MLReadable[KMeansModel] {
 /**
  * K-means clustering with support for k-means|| initialization proposed by Bahmani et al.
  *
- * @see <a href="http://dx.doi.org/10.14778/2180912.2180915">Bahmani et al., Scalable k-means++.</a>
+ * @see <a href="https://doi.org/10.14778/2180912.2180915">Bahmani et al., Scalable k-means++.</a>
  */
 @Since("1.5.0")
 class KMeans @Since("1.5.0") (

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala b/mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala
index 840a89b..7322815 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala
@@ -118,10 +118,10 @@ private[fpm] trait FPGrowthParams extends Params with HasPredictionCol {
 /**
  * :: Experimental ::
  * A parallel FP-growth algorithm to mine frequent itemsets. The algorithm is described in
- * <a href="http://dx.doi.org/10.1145/1454008.1454027">Li et al., PFP: Parallel FP-Growth for Query
+ * <a href="https://doi.org/10.1145/1454008.1454027">Li et al., PFP: Parallel FP-Growth for Query
  * Recommendation</a>. PFP distributes computation in such a way that each worker executes an
  * independent group of mining tasks. The FP-Growth algorithm is described in
- * <a href="http://dx.doi.org/10.1145/335191.335372">Han et al., Mining frequent patterns without
+ * <a href="https://doi.org/10.1145/335191.335372">Han et al., Mining frequent patterns without
  * candidate generation</a>. Note null values in the itemsCol column are ignored during fit().
  *
  * @see <a href="http://en.wikipedia.org/wiki/Association_rule_learning">

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/mllib/src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala b/mllib/src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala
index bd1c1a8..2a34135 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala
@@ -30,7 +30,7 @@ import org.apache.spark.sql.types.{ArrayType, LongType, StructField, StructType}
  * A parallel PrefixSpan algorithm to mine frequent sequential patterns.
  * The PrefixSpan algorithm is described in J. Pei, et al., PrefixSpan: Mining Sequential Patterns
  * Efficiently by Prefix-Projected Pattern Growth
- * (see <a href="http://doi.org/10.1109/ICDE.2001.914830">here</a>).
+ * (see <a href="https://doi.org/10.1109/ICDE.2001.914830">here</a>).
  * This class is not yet an Estimator/Transformer, use `findFrequentSequentialPatterns` method to
  * run the PrefixSpan algorithm.
  *

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala b/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
index ffe5927..50ef433 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
@@ -557,7 +557,7 @@ object ALSModel extends MLReadable[ALSModel] {
  *
  * For implicit preference data, the algorithm used is based on
  * "Collaborative Filtering for Implicit Feedback Datasets", available at
- * http://dx.doi.org/10.1109/ICDM.2008.22, adapted for the blocked approach used here.
+ * https://doi.org/10.1109/ICDM.2008.22, adapted for the blocked approach used here.
  *
  * Essentially instead of finding the low-rank approximations to the rating matrix `R`,
  * this finds the approximations for a preference matrix `P` where the elements of `P` are 1 if

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/mllib/src/main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala b/mllib/src/main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala
index 3a1bc35..519c1ea 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala
@@ -152,10 +152,10 @@ object FPGrowthModel extends Loader[FPGrowthModel[_]] {
 
 /**
  * A parallel FP-growth algorithm to mine frequent itemsets. The algorithm is described in
- * <a href="http://dx.doi.org/10.1145/1454008.1454027">Li et al., PFP: Parallel FP-Growth for Query
+ * <a href="https://doi.org/10.1145/1454008.1454027">Li et al., PFP: Parallel FP-Growth for Query
  * Recommendation</a>. PFP distributes computation in such a way that each worker executes an
  * independent group of mining tasks. The FP-Growth algorithm is described in
- * <a href="http://dx.doi.org/10.1145/335191.335372">Han et al., Mining frequent patterns without
+ * <a href="https://doi.org/10.1145/335191.335372">Han et al., Mining frequent patterns without
  * candidate generation</a>.
  *
  * @param minSupport the minimal support level of the frequent pattern, any pattern that appears

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/mllib/src/main/scala/org/apache/spark/mllib/fpm/PrefixSpan.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/fpm/PrefixSpan.scala b/mllib/src/main/scala/org/apache/spark/mllib/fpm/PrefixSpan.scala
index 64d6a0b..b2c09b4 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/fpm/PrefixSpan.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/fpm/PrefixSpan.scala
@@ -45,7 +45,7 @@ import org.apache.spark.storage.StorageLevel
  * A parallel PrefixSpan algorithm to mine frequent sequential patterns.
  * The PrefixSpan algorithm is described in J. Pei, et al., PrefixSpan: Mining Sequential Patterns
  * Efficiently by Prefix-Projected Pattern Growth
- * (see <a href="http://doi.org/10.1109/ICDE.2001.914830">here</a>).
+ * (see <a href="https://doi.org/10.1109/ICDE.2001.914830">here</a>).
  *
  * @param minSupport the minimal support level of the sequential pattern, any pattern that appears
  *                   more than (minSupport * size-of-the-dataset) times will be output

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala b/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
index 82ab716..c12b751 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala
@@ -540,7 +540,7 @@ class RowMatrix @Since("1.0.0") (
    * decomposition (factorization) for the [[RowMatrix]] of a tall and skinny shape.
    * Reference:
    *  Paul G. Constantine, David F. Gleich. "Tall and skinny QR factorizations in MapReduce
-   *  architectures" (see <a href="http://dx.doi.org/10.1145/1996092.1996103">here</a>)
+   *  architectures" (see <a href="https://doi.org/10.1145/1996092.1996103">here</a>)
    *
    * @param computeQ whether to computeQ
    * @return QRDecomposition(Q, R), Q = null if computeQ = false.

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala b/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
index 1428822..12870f8 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
@@ -54,7 +54,7 @@ case class Rating @Since("0.8.0") (
  *
  * For implicit preference data, the algorithm used is based on
  * "Collaborative Filtering for Implicit Feedback Datasets", available at
- * <a href="http://dx.doi.org/10.1109/ICDM.2008.22";>here</a>, adapted for the 
blocked approach
+ * <a href="https://doi.org/10.1109/ICDM.2008.22";>here</a>, adapted for the 
blocked approach
  * used here.
  *
  * Essentially instead of finding the low-rank approximations to the rating 
matrix `R`,
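For reference, the implicit-feedback cost function from the Hu, Koren, and Volinsky paper linked above (as stated in that paper, not transcribed from Spark's code), with confidence $c_{ui}$ and binarized preference $p_{ui}$:

```latex
\min_{x_*, y_*} \; \sum_{u,i} c_{ui}\left(p_{ui} - x_u^\top y_i\right)^2
  + \lambda \left( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \right),
\qquad
c_{ui} = 1 + \alpha\, r_{ui}, \quad
p_{ui} = \begin{cases} 1 & r_{ui} > 0 \\ 0 & r_{ui} = 0 \end{cases}
```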

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/python/pyspark/ml/fpm.py
----------------------------------------------------------------------
diff --git a/python/pyspark/ml/fpm.py b/python/pyspark/ml/fpm.py
index 886ad84..734763e 100644
--- a/python/pyspark/ml/fpm.py
+++ b/python/pyspark/ml/fpm.py
@@ -167,8 +167,8 @@ class FPGrowth(JavaEstimator, HasItemsCol, HasPredictionCol,
     independent group of mining tasks. The FP-Growth algorithm is described in
     Han et al., Mining frequent patterns without candidate generation [HAN2000]_
 
-    .. [LI2008] http://dx.doi.org/10.1145/1454008.1454027
-    .. [HAN2000] http://dx.doi.org/10.1145/335191.335372
+    .. [LI2008] https://doi.org/10.1145/1454008.1454027
+    .. [HAN2000] https://doi.org/10.1145/335191.335372
 
     .. note:: null values in the feature column are ignored during fit().
     .. note:: Internally `transform` `collects` and `broadcasts` association rules.
@@ -254,7 +254,7 @@ class PrefixSpan(JavaParams):
     A parallel PrefixSpan algorithm to mine frequent sequential patterns.
     The PrefixSpan algorithm is described in J. Pei, et al., PrefixSpan: Mining Sequential Patterns
     Efficiently by Prefix-Projected Pattern Growth
-    (see <a href="http://doi.org/10.1109/ICDE.2001.914830">here</a>).
+    (see <a href="https://doi.org/10.1109/ICDE.2001.914830">here</a>).
     This class is not yet an Estimator/Transformer, use :py:func:`findFrequentSequentialPatterns`
     method to run the PrefixSpan algorithm.
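The [HAN2000] reference above is the FP-Growth paper: mining frequent itemsets without candidate generation via an FP-tree. The sketch below instead uses brute-force subset counting (explicitly *not* the FP-tree algorithm, which avoids this enumeration) just to show the input/output contract; `frequent_itemsets` and the toy transactions are illustrative:

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute-force frequent-itemset counting (NOT the FP-tree algorithm)."""
    counts = {}
    for txn in transactions:
        items = sorted(set(txn))
        # Count every non-empty subset of each transaction.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] = counts.get(subset, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}

txns = [["a", "b"], ["a", "b", "c"], ["a", "c"]]
print(frequent_itemsets(txns, min_support=2))
```

FP-Growth produces the same answer set while touching each transaction only twice, which is what makes it practical at Spark scale.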
 

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/python/pyspark/ml/recommendation.py
----------------------------------------------------------------------
diff --git a/python/pyspark/ml/recommendation.py b/python/pyspark/ml/recommendation.py
index a8eae9b..520d791 100644
--- a/python/pyspark/ml/recommendation.py
+++ b/python/pyspark/ml/recommendation.py
@@ -57,7 +57,7 @@ class ALS(JavaEstimator, HasCheckpointInterval, HasMaxIter, HasPredictionCol, Ha
 
     For implicit preference data, the algorithm used is based on
     `"Collaborative Filtering for Implicit Feedback Datasets",
-    <http://dx.doi.org/10.1109/ICDM.2008.22>`_, adapted for the blocked
+    <https://doi.org/10.1109/ICDM.2008.22>`_, adapted for the blocked
     approach used here.
 
     Essentially instead of finding the low-rank approximations to the

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/python/pyspark/mllib/fpm.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/fpm.py b/python/pyspark/mllib/fpm.py
index de18dad..6accb9b 100644
--- a/python/pyspark/mllib/fpm.py
+++ b/python/pyspark/mllib/fpm.py
@@ -132,7 +132,7 @@ class PrefixSpan(object):
     A parallel PrefixSpan algorithm to mine frequent sequential patterns.
     The PrefixSpan algorithm is described in J. Pei, et al., PrefixSpan:
     Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth
-    ([[http://doi.org/10.1109/ICDE.2001.914830]]).
+    ([[https://doi.org/10.1109/ICDE.2001.914830]]).
 
     .. versionadded:: 1.6.0
     """

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/python/pyspark/mllib/linalg/distributed.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/linalg/distributed.py b/python/pyspark/mllib/linalg/distributed.py
index 7e8b150..b7f0978 100644
--- a/python/pyspark/mllib/linalg/distributed.py
+++ b/python/pyspark/mllib/linalg/distributed.py
@@ -270,7 +270,7 @@ class RowMatrix(DistributedMatrix):
         Reference:
          Paul G. Constantine, David F. Gleich. "Tall and skinny QR
          factorizations in MapReduce architectures"
-         ([[http://dx.doi.org/10.1145/1996092.1996103]])
+         ([[https://doi.org/10.1145/1996092.1996103]])
 
         :param: computeQ: whether to computeQ
         :return: QRDecomposition(Q: RowMatrix, R: Matrix), where

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/python/pyspark/rdd.py
----------------------------------------------------------------------
diff --git a/python/pyspark/rdd.py b/python/pyspark/rdd.py
index ccf39e1..8bd6897 100644
--- a/python/pyspark/rdd.py
+++ b/python/pyspark/rdd.py
@@ -2354,7 +2354,7 @@ class RDD(object):
         The algorithm used is based on streamlib's implementation of
         `"HyperLogLog in Practice: Algorithmic Engineering of a State
         of The Art Cardinality Estimation Algorithm", available here
-        <http://dx.doi.org/10.1145/2452376.2452456>`_.
+        <https://doi.org/10.1145/2452376.2452456>`_.
 
         :param relativeSD: Relative accuracy. Smaller values create
                            counters that require more space.
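The `countApproxDistinct` docstring above cites the HyperLogLog-in-Practice paper. A compact HyperLogLog sketch (the core register/leading-zero idea only, without streamlib's sparse representation or bias tables; all names are illustrative):

```python
import hashlib

def approx_count_distinct(items, b=8):
    """Tiny HyperLogLog: 2**b registers, each keeping the longest
    leading-zero run observed among hashes routed to it."""
    m = 1 << b
    registers = [0] * m
    for item in items:
        # 64-bit deterministic hash (md5 keeps runs reproducible).
        h = int.from_bytes(hashlib.md5(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - b)                # first b bits pick a register
        rest = h & ((1 << (64 - b)) - 1)   # remaining bits give the rank
        rank = (64 - b) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)       # bias correction for m >= 128
    return alpha * m * m / sum(2.0 ** -r for r in registers)

print(round(approx_count_distinct(range(10000))))
```

With 256 registers the relative standard error is roughly 1.04/sqrt(256), about 6.5%, matching the `relativeSD` trade-off described above: more registers, more space, tighter estimates.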

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/python/pyspark/sql/dataframe.py
----------------------------------------------------------------------
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index c4f4d81..4abbeac 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -1806,7 +1806,7 @@ class DataFrame(object):
 
         This method implements a variation of the Greenwald-Khanna
         algorithm (with some speed optimizations). The algorithm was first
-        present in [[http://dx.doi.org/10.1145/375663.375670
+        present in [[https://doi.org/10.1145/375663.375670
         Space-efficient Online Computation of Quantile Summaries]]
         by Greenwald and Khanna.
 
@@ -1928,7 +1928,7 @@ class DataFrame(object):
         """
         Finding frequent items for columns, possibly with false positives. Using the
         frequent element count algorithm described in
-        "http://dx.doi.org/10.1145/762471.762473, proposed by Karp, Schenker, and Papadimitriou".
+        "https://doi.org/10.1145/762471.762473, proposed by Karp, Schenker, and Papadimitriou".
         :func:`DataFrame.freqItems` and :func:`DataFrameStatFunctions.freqItems` are aliases.
 
         .. note:: This function is meant for exploratory data analysis, as we make no

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala
index 3190e51..2a03f85 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala
@@ -25,7 +25,7 @@ import org.apache.spark.sql.catalyst.util.QuantileSummaries.Stats
  * Helper class to compute approximate quantile summary.
  * This implementation is based on the algorithm proposed in the paper:
  * "Space-efficient Online Computation of Quantile Summaries" by Greenwald, Michael
- * and Khanna, Sanjeev. (http://dx.doi.org/10.1145/375663.375670)
+ * and Khanna, Sanjeev. (https://doi.org/10.1145/375663.375670)
  *
  * In order to optimize for speed, it maintains an internal buffer of the last seen samples,
  * and only inserts them after crossing a certain size threshold. This guarantees a near-constant
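Greenwald-Khanna maintains weighted tuples with deterministic error bounds; a full implementation is beyond a sketch here. As a much cruder constant-memory stand-in (reservoir sampling, *not* GK; `approx_quantile` and its parameters are illustrative), the shape of the problem looks like:

```python
import random

def approx_quantile(stream, q, capacity=200, seed=42):
    """Approximate quantile via reservoir sampling: keep a uniform
    random sample of fixed size, then read the quantile off it."""
    rng = random.Random(seed)
    reservoir = []
    for n, x in enumerate(stream, start=1):
        if len(reservoir) < capacity:
            reservoir.append(x)
        else:
            # Algorithm R: item n replaces a slot with probability capacity/n.
            j = rng.randrange(n)
            if j < capacity:
                reservoir[j] = x
    reservoir.sort()
    idx = min(int(q * len(reservoir)), len(reservoir) - 1)
    return reservoir[idx]

print(approx_quantile(range(10000), 0.5))
```

Unlike this sampler, GK gives a hard rank-error guarantee per element, which is what `approxQuantile`'s `relativeError` parameter exposes.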

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala
index b2f6a6b..0b22b89 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala
@@ -51,7 +51,7 @@ final class DataFrameStatFunctions private[sql](df: DataFrame) {
    *
    * This method implements a variation of the Greenwald-Khanna algorithm (with some speed
    * optimizations).
-   * The algorithm was first present in <a href="http://dx.doi.org/10.1145/375663.375670">
+   * The algorithm was first present in <a href="https://doi.org/10.1145/375663.375670">
    * Space-efficient Online Computation of Quantile Summaries</a> by Greenwald and Khanna.
    *
    * @param col the name of the numerical column
@@ -218,7 +218,7 @@ final class DataFrameStatFunctions private[sql](df: DataFrame) {
   /**
    * Finding frequent items for columns, possibly with false positives. Using the
    * frequent element count algorithm described in
-   * <a href="http://dx.doi.org/10.1145/762471.762473">here</a>, proposed by Karp,
+   * <a href="https://doi.org/10.1145/762471.762473">here</a>, proposed by Karp,
    * Schenker, and Papadimitriou.
    * The `support` should be greater than 1e-4.
    *
@@ -265,7 +265,7 @@ final class DataFrameStatFunctions private[sql](df: DataFrame) {
   /**
    * Finding frequent items for columns, possibly with false positives. Using the
    * frequent element count algorithm described in
-   * <a href="http://dx.doi.org/10.1145/762471.762473">here</a>, proposed by Karp,
+   * <a href="https://doi.org/10.1145/762471.762473">here</a>, proposed by Karp,
    * Schenker, and Papadimitriou.
    * Uses a `default` support of 1%.
    *
@@ -284,7 +284,7 @@ final class DataFrameStatFunctions private[sql](df: DataFrame) {
   /**
    * (Scala-specific) Finding frequent items for columns, possibly with false positives. Using the
    * frequent element count algorithm described in
-   * <a href="http://dx.doi.org/10.1145/762471.762473">here</a>, proposed by Karp, Schenker,
+   * <a href="https://doi.org/10.1145/762471.762473">here</a>, proposed by Karp, Schenker,
    * and Papadimitriou.
    *
    * This function is meant for exploratory data analysis, as we make no guarantee about the
@@ -328,7 +328,7 @@ final class DataFrameStatFunctions private[sql](df: DataFrame) {
   /**
    * (Scala-specific) Finding frequent items for columns, possibly with false positives. Using the
    * frequent element count algorithm described in
-   * <a href="http://dx.doi.org/10.1145/762471.762473">here</a>, proposed by Karp, Schenker,
+   * <a href="https://doi.org/10.1145/762471.762473">here</a>, proposed by Karp, Schenker,
    * and Papadimitriou.
    * Uses a `default` support of 1%.
    *

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala
index 86f6307..420faa6 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala
@@ -69,7 +69,7 @@ object FrequentItems extends Logging {
   /**
    * Finding frequent items for columns, possibly with false positives. Using the
    * frequent element count algorithm described in
-   * <a href="http://dx.doi.org/10.1145/762471.762473">here</a>, proposed by Karp, Schenker,
+   * <a href="https://doi.org/10.1145/762471.762473">here</a>, proposed by Karp, Schenker,
    * and Papadimitriou.
    * The `support` should be greater than 1e-4.
    * For Internal use only.
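The Karp-Schenker-Papadimitriou counting cited throughout these hunks keeps at most about 1/support counters; when they are full, every counter is decremented. Any item occurring in more than a `support` fraction of the stream is guaranteed to survive, though false positives may remain. A minimal sketch (illustrative names, not Spark's columnar implementation):

```python
import math

def freq_items(stream, support):
    """Karp-Schenker-Papadimitriou frequent-element counting: returns a
    superset of all items occurring in > support fraction of the stream."""
    k = math.ceil(1.0 / support)
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # Full: decrement every counter, dropping those that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return set(counters)

stream = ["a"] * 50 + [f"x{i}" for i in range(50)]
print(freq_items(stream, support=0.2))
```

Here "a" occupies 50% of a 100-element stream, well above the 20% support threshold, so it is guaranteed to be reported; a few infrequent `x` items may appear alongside it as false positives.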

http://git-wip-us.apache.org/repos/asf/spark/blob/c5daccb1/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala
index bea652c..ac25a8f 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/stat/StatFunctions.scala
@@ -45,7 +45,7 @@ object StatFunctions extends Logging {
   *
   * This method implements a variation of the Greenwald-Khanna algorithm (with some speed
   * optimizations).
-   * The algorithm was first present in <a href="http://dx.doi.org/10.1145/375663.375670">
+   * The algorithm was first present in <a href="https://doi.org/10.1145/375663.375670">
   * Space-efficient Online Computation of Quantile Summaries</a> by Greenwald and Khanna.
   *
   * @param df the dataframe

