This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.3 by this push:
new 76b7ea25e15 [SPARK-40322][DOCS][3.3] Fix all dead links in the docs
76b7ea25e15 is described below
commit 76b7ea25e155b1786ebc3d82ecebe2f37e926223
Author: Yuming Wang <[email protected]>
AuthorDate: Sat Sep 24 17:12:16 2022 +0900
[SPARK-40322][DOCS][3.3] Fix all dead links in the docs
This PR backports https://github.com/apache/spark/pull/37981 to branch-3.3.
The original PR description:
### What changes were proposed in this pull request?
This PR fixes all dead links in the documentation.
### Why are the changes needed?
Correct dead links in the documentation.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Manual test.
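The check can also be scripted instead of clicking through pages by hand. The sketch below is illustrative only and not part of this patch: it assumes the site has already been built locally (the `docs/_site` output path is an assumption), and it only verifies that relative link targets exist on disk, which is exactly the class of breakage fixed here.

```python
# Hypothetical link checker (not part of this patch): walk a locally built
# docs site and report relative links whose target file does not exist.
import os
import re
from urllib.parse import urlparse

SITE_ROOT = "docs/_site"  # assumed Jekyll build output; adjust as needed
HREF_RE = re.compile(r'href="([^"#]+)')

dead = []
for dirpath, _, filenames in os.walk(SITE_ROOT):
    for name in filenames:
        if not name.endswith(".html"):
            continue
        page = os.path.join(dirpath, name)
        with open(page, encoding="utf-8", errors="ignore") as f:
            html = f.read()
        for target in HREF_RE.findall(html):
            # This sketch only checks relative links; external (http/mailto)
            # and root-absolute links are skipped.
            if urlparse(target).scheme or target.startswith("/"):
                continue
            resolved = os.path.normpath(os.path.join(dirpath, target))
            if not os.path.exists(resolved):
                dead.append((page, target))

for page, target in dead:
    print(f"{page}: dead link -> {target}")
print(f"{len(dead)} dead link(s) found")
```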
Closes #37984 from wangyum/branch-3.3-SPARK-40322.
Authored-by: Yuming Wang <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
docs/ml-classification-regression.md | 34 +++++++--------
docs/ml-clustering.md | 10 ++---
docs/ml-collaborative-filtering.md | 2 +-
docs/ml-frequent-pattern-mining.md | 4 +-
docs/rdd-programming-guide.md | 4 +-
docs/sparkr.md | 48 +++++++++++-----------
docs/sql-getting-started.md | 10 ++---
docs/structured-streaming-programming-guide.md | 16 ++++----
.../source/getting_started/quickstart_ps.ipynb | 2 +-
9 files changed, 65 insertions(+), 65 deletions(-)
diff --git a/docs/ml-classification-regression.md b/docs/ml-classification-regression.md
index bad74cbcf6c..c3e1b6b4390 100644
--- a/docs/ml-classification-regression.md
+++ b/docs/ml-classification-regression.md
@@ -92,7 +92,7 @@ More details on parameters can be found in the [Python API documentation](api/py
<div data-lang="r" markdown="1">
-More details on parameters can be found in the [R API documentation](api/R/spark.logit.html).
+More details on parameters can be found in the [R API documentation](api/R/reference/spark.logit.html).
{% include_example binomial r/ml/logit.R %}
</div>
@@ -195,7 +195,7 @@ training summary for evaluating the model.
<div data-lang="r" markdown="1">
-More details on parameters can be found in the [R API documentation](api/R/spark.logit.html).
+More details on parameters can be found in the [R API documentation](api/R/reference/spark.logit.html).
{% include_example multinomial r/ml/logit.R %}
</div>
@@ -240,7 +240,7 @@ More details on parameters can be found in the [Python API documentation](api/py
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.decisionTree.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.decisionTree.html) for more details.
{% include_example classification r/ml/decisionTree.R %}
@@ -282,7 +282,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.classificatio
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.randomForest.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.randomForest.html) for more details.
{% include_example classification r/ml/randomForest.R %}
</div>
@@ -323,7 +323,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.classificatio
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.gbt.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.gbt.html) for more details.
{% include_example classification r/ml/gbt.R %}
</div>
@@ -379,7 +379,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.classificatio
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.mlp.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.mlp.html) for more details.
{% include_example r/ml/mlp.R %}
</div>
@@ -424,7 +424,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.classificatio
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.svmLinear.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.svmLinear.html) for more details.
{% include_example r/ml/svmLinear.R %}
</div>
@@ -522,7 +522,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.classificatio
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.naiveBayes.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.naiveBayes.html) for more details.
{% include_example r/ml/naiveBayes.R %}
</div>
@@ -565,7 +565,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.classificatio
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.fmClassifier.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.fmClassifier.html) for more details.
Note: At the moment SparkR doesn't support feature scaling.
@@ -616,7 +616,7 @@ More details on parameters can be found in the [Python API documentation](api/py
<div data-lang="r" markdown="1">
-More details on parameters can be found in the [R API documentation](api/R/spark.lm.html).
+More details on parameters can be found in the [R API documentation](api/R/reference/spark.lm.html).
{% include_example r/ml/lm_with_elastic_net.R %}
</div>
@@ -763,7 +763,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.regression.Ge
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.glm.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.glm.html) for more details.
{% include_example r/ml/glm.R %}
</div>
@@ -805,7 +805,7 @@ More details on parameters can be found in the [Python API documentation](api/py
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.decisionTree.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.decisionTree.html) for more details.
{% include_example regression r/ml/decisionTree.R %}
</div>
@@ -847,7 +847,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.regression.Ra
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.randomForest.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.randomForest.html) for more details.
{% include_example regression r/ml/randomForest.R %}
</div>
@@ -888,7 +888,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.regression.GB
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.gbt.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.gbt.html) for more details.
{% include_example regression r/ml/gbt.R %}
</div>
@@ -982,7 +982,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.regression.AF
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.survreg.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.survreg.html) for more details.
{% include_example r/ml/survreg.R %}
</div>
@@ -1060,7 +1060,7 @@ Refer to the [`IsotonicRegression` Python docs](api/python/reference/api/pyspark
<div data-lang="r" markdown="1">
-Refer to the [`IsotonicRegression` R API docs](api/R/spark.isoreg.html) for more details on the API.
+Refer to the [`IsotonicRegression` R API docs](api/R/reference/spark.isoreg.html) for more details on the API.
{% include_example r/ml/isoreg.R %}
</div>
@@ -1103,7 +1103,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.regression.FM
<div data-lang="r" markdown="1">
-Refer to the [R API documentation](api/R/spark.fmRegressor.html) for more details.
+Refer to the [R API documentation](api/R/reference/spark.fmRegressor.html) for more details.
Note: At the moment SparkR doesn't support feature scaling.
diff --git a/docs/ml-clustering.md b/docs/ml-clustering.md
index f478776196d..1d15f61a29d 100644
--- a/docs/ml-clustering.md
+++ b/docs/ml-clustering.md
@@ -104,7 +104,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.clustering.KM
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.kmeans.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.kmeans.html) for more details.
{% include_example r/ml/kmeans.R %}
</div>
@@ -144,7 +144,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.clustering.LD
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.lda.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.lda.html) for more details.
{% include_example r/ml/lda.R %}
</div>
@@ -185,7 +185,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.clustering.Bi
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.bisectingKmeans.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.bisectingKmeans.html) for more details.
{% include_example r/ml/bisectingKmeans.R %}
</div>
@@ -274,7 +274,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.clustering.Ga
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.gaussianMixture.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.gaussianMixture.html) for more details.
{% include_example r/ml/gaussianMixture.R %}
</div>
@@ -321,7 +321,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.clustering.Po
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.powerIterationClustering.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.powerIterationClustering.html) for more details.
{% include_example r/ml/powerIterationClustering.R %}
</div>
diff --git a/docs/ml-collaborative-filtering.md b/docs/ml-collaborative-filtering.md
index ddc90406648..8b6d2a1d14c 100644
--- a/docs/ml-collaborative-filtering.md
+++ b/docs/ml-collaborative-filtering.md
@@ -195,7 +195,7 @@ als = ALS(maxIter=5, regParam=0.01, implicitPrefs=True,
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.als.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.als.html) for more details.
{% include_example r/ml/als.R %}
</div>
diff --git a/docs/ml-frequent-pattern-mining.md b/docs/ml-frequent-pattern-mining.md
index 6e6ae410cb7..58cd29fd8f6 100644
--- a/docs/ml-frequent-pattern-mining.md
+++ b/docs/ml-frequent-pattern-mining.md
@@ -102,7 +102,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.fpm.FPGrowth.
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.fpGrowth.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.fpGrowth.html) for more details.
{% include_example r/ml/fpm.R %}
</div>
@@ -155,7 +155,7 @@ Refer to the [Python API docs](api/python/reference/api/pyspark.ml.fpm.PrefixSpa
<div data-lang="r" markdown="1">
-Refer to the [R API docs](api/R/spark.prefixSpan.html) for more details.
+Refer to the [R API docs](api/R/reference/spark.prefixSpan.html) for more details.
{% include_example r/ml/prefixSpan.R %}
</div>
diff --git a/docs/rdd-programming-guide.md b/docs/rdd-programming-guide.md
index 4234eb6365f..7e4664f2a0e 100644
--- a/docs/rdd-programming-guide.md
+++ b/docs/rdd-programming-guide.md
@@ -950,7 +950,7 @@ RDD API doc
([Scala](api/scala/org/apache/spark/rdd/RDD.html),
[Java](api/java/index.html?org/apache/spark/api/java/JavaRDD.html),
[Python](api/python/reference/api/pyspark.RDD.html#pyspark.RDD),
- [R](api/R/index.html))
+ [R](api/R/reference/index.html))
and pair RDD functions doc
([Scala](api/scala/org/apache/spark/rdd/PairRDDFunctions.html),
[Java](api/java/index.html?org/apache/spark/api/java/JavaPairRDD.html))
@@ -1064,7 +1064,7 @@ RDD API doc
([Scala](api/scala/org/apache/spark/rdd/RDD.html),
[Java](api/java/index.html?org/apache/spark/api/java/JavaRDD.html),
[Python](api/python/reference/api/pyspark.RDD.html#pyspark.RDD),
- [R](api/R/index.html))
+ [R](api/R/reference/index.html))
and pair RDD functions doc
([Scala](api/scala/org/apache/spark/rdd/PairRDDFunctions.html),
diff --git a/docs/sparkr.md b/docs/sparkr.md
index 002da5a56fa..2e55c7a20c0 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -175,7 +175,7 @@ people <- read.json(c("./examples/src/main/resources/people.json", "./examples/s
{% endhighlight %}
</div>
-The data sources API natively supports CSV formatted input files. For more information please refer to SparkR [read.df](api/R/read.df.html) API documentation.
+The data sources API natively supports CSV formatted input files. For more information please refer to SparkR [read.df](api/R/reference/read.df.html) API documentation.
<div data-lang="r" markdown="1">
{% highlight r %}
@@ -536,49 +536,49 @@ SparkR supports the following machine learning algorithms currently:
#### Classification
-* [`spark.logit`](api/R/spark.logit.html): [`Logistic Regression`](ml-classification-regression.html#logistic-regression)
-* [`spark.mlp`](api/R/spark.mlp.html): [`Multilayer Perceptron (MLP)`](ml-classification-regression.html#multilayer-perceptron-classifier)
-* [`spark.naiveBayes`](api/R/spark.naiveBayes.html): [`Naive Bayes`](ml-classification-regression.html#naive-bayes)
-* [`spark.svmLinear`](api/R/spark.svmLinear.html): [`Linear Support Vector Machine`](ml-classification-regression.html#linear-support-vector-machine)
-* [`spark.fmClassifier`](api/R/fmClassifier.html): [`Factorization Machines classifier`](ml-classification-regression.html#factorization-machines-classifier)
+* [`spark.logit`](api/R/reference/spark.logit.html): [`Logistic Regression`](ml-classification-regression.html#logistic-regression)
+* [`spark.mlp`](api/R/reference/spark.mlp.html): [`Multilayer Perceptron (MLP)`](ml-classification-regression.html#multilayer-perceptron-classifier)
+* [`spark.naiveBayes`](api/R/reference/spark.naiveBayes.html): [`Naive Bayes`](ml-classification-regression.html#naive-bayes)
+* [`spark.svmLinear`](api/R/reference/spark.svmLinear.html): [`Linear Support Vector Machine`](ml-classification-regression.html#linear-support-vector-machine)
+* [`spark.fmClassifier`](api/R/reference/fmClassifier.html): [`Factorization Machines classifier`](ml-classification-regression.html#factorization-machines-classifier)
#### Regression
-* [`spark.survreg`](api/R/spark.survreg.html): [`Accelerated Failure Time (AFT) Survival Model`](ml-classification-regression.html#survival-regression)
-* [`spark.glm`](api/R/spark.glm.html) or [`glm`](api/R/glm.html): [`Generalized Linear Model (GLM)`](ml-classification-regression.html#generalized-linear-regression)
-* [`spark.isoreg`](api/R/spark.isoreg.html): [`Isotonic Regression`](ml-classification-regression.html#isotonic-regression)
-* [`spark.lm`](api/R/spark.lm.html): [`Linear Regression`](ml-classification-regression.html#linear-regression)
-* [`spark.fmRegressor`](api/R/spark.fmRegressor.html): [`Factorization Machines regressor`](ml-classification-regression.html#factorization-machines-regressor)
+* [`spark.survreg`](api/R/reference/spark.survreg.html): [`Accelerated Failure Time (AFT) Survival Model`](ml-classification-regression.html#survival-regression)
+* [`spark.glm`](api/R/reference/spark.glm.html) or [`glm`](api/R/reference/glm.html): [`Generalized Linear Model (GLM)`](ml-classification-regression.html#generalized-linear-regression)
+* [`spark.isoreg`](api/R/reference/spark.isoreg.html): [`Isotonic Regression`](ml-classification-regression.html#isotonic-regression)
+* [`spark.lm`](api/R/reference/spark.lm.html): [`Linear Regression`](ml-classification-regression.html#linear-regression)
+* [`spark.fmRegressor`](api/R/reference/spark.fmRegressor.html): [`Factorization Machines regressor`](ml-classification-regression.html#factorization-machines-regressor)
#### Tree
-* [`spark.decisionTree`](api/R/spark.decisionTree.html): `Decision Tree for` [`Regression`](ml-classification-regression.html#decision-tree-regression) `and` [`Classification`](ml-classification-regression.html#decision-tree-classifier)
-* [`spark.gbt`](api/R/spark.gbt.html): `Gradient Boosted Trees for` [`Regression`](ml-classification-regression.html#gradient-boosted-tree-regression) `and` [`Classification`](ml-classification-regression.html#gradient-boosted-tree-classifier)
-* [`spark.randomForest`](api/R/spark.randomForest.html): `Random Forest for` [`Regression`](ml-classification-regression.html#random-forest-regression) `and` [`Classification`](ml-classification-regression.html#random-forest-classifier)
+* [`spark.decisionTree`](api/R/reference/spark.decisionTree.html): `Decision Tree for` [`Regression`](ml-classification-regression.html#decision-tree-regression) `and` [`Classification`](ml-classification-regression.html#decision-tree-classifier)
+* [`spark.gbt`](api/R/reference/spark.gbt.html): `Gradient Boosted Trees for` [`Regression`](ml-classification-regression.html#gradient-boosted-tree-regression) `and` [`Classification`](ml-classification-regression.html#gradient-boosted-tree-classifier)
+* [`spark.randomForest`](api/R/reference/spark.randomForest.html): `Random Forest for` [`Regression`](ml-classification-regression.html#random-forest-regression) `and` [`Classification`](ml-classification-regression.html#random-forest-classifier)
#### Clustering
-* [`spark.bisectingKmeans`](api/R/spark.bisectingKmeans.html): [`Bisecting k-means`](ml-clustering.html#bisecting-k-means)
-* [`spark.gaussianMixture`](api/R/spark.gaussianMixture.html): [`Gaussian Mixture Model (GMM)`](ml-clustering.html#gaussian-mixture-model-gmm)
-* [`spark.kmeans`](api/R/spark.kmeans.html): [`K-Means`](ml-clustering.html#k-means)
-* [`spark.lda`](api/R/spark.lda.html): [`Latent Dirichlet Allocation (LDA)`](ml-clustering.html#latent-dirichlet-allocation-lda)
-* [`spark.powerIterationClustering (PIC)`](api/R/spark.powerIterationClustering.html): [`Power Iteration Clustering (PIC)`](ml-clustering.html#power-iteration-clustering-pic)
+* [`spark.bisectingKmeans`](api/R/reference/spark.bisectingKmeans.html): [`Bisecting k-means`](ml-clustering.html#bisecting-k-means)
+* [`spark.gaussianMixture`](api/R/reference/spark.gaussianMixture.html): [`Gaussian Mixture Model (GMM)`](ml-clustering.html#gaussian-mixture-model-gmm)
+* [`spark.kmeans`](api/R/reference/spark.kmeans.html): [`K-Means`](ml-clustering.html#k-means)
+* [`spark.lda`](api/R/reference/spark.lda.html): [`Latent Dirichlet Allocation (LDA)`](ml-clustering.html#latent-dirichlet-allocation-lda)
+* [`spark.powerIterationClustering (PIC)`](api/R/reference/spark.powerIterationClustering.html): [`Power Iteration Clustering (PIC)`](ml-clustering.html#power-iteration-clustering-pic)
#### Collaborative Filtering
-* [`spark.als`](api/R/spark.als.html): [`Alternating Least Squares (ALS)`](ml-collaborative-filtering.html#collaborative-filtering)
+* [`spark.als`](api/R/reference/spark.als.html): [`Alternating Least Squares (ALS)`](ml-collaborative-filtering.html#collaborative-filtering)
#### Frequent Pattern Mining
-* [`spark.fpGrowth`](api/R/spark.fpGrowth.html) : [`FP-growth`](ml-frequent-pattern-mining.html#fp-growth)
-* [`spark.prefixSpan`](api/R/spark.prefixSpan.html) : [`PrefixSpan`](ml-frequent-pattern-mining.html#prefixSpan)
+* [`spark.fpGrowth`](api/R/reference/spark.fpGrowth.html) : [`FP-growth`](ml-frequent-pattern-mining.html#fp-growth)
+* [`spark.prefixSpan`](api/R/reference/spark.prefixSpan.html) : [`PrefixSpan`](ml-frequent-pattern-mining.html#prefixSpan)
#### Statistics
-* [`spark.kstest`](api/R/spark.kstest.html): `Kolmogorov-Smirnov Test`
+* [`spark.kstest`](api/R/reference/spark.kstest.html): `Kolmogorov-Smirnov Test`
Under the hood, SparkR uses MLlib to train the model. Please refer to the corresponding section of MLlib user guide for example code.
-Users can call `summary` to print a summary of the fitted model, [predict](api/R/predict.html) to make predictions on new data, and [write.ml](api/R/write.ml.html)/[read.ml](api/R/read.ml.html) to save/load fitted models.
+Users can call `summary` to print a summary of the fitted model, [predict](api/R/reference/predict.html) to make predictions on new data, and [write.ml](api/R/reference/write.ml.html)/[read.ml](api/R/reference/read.ml.html) to save/load fitted models.
SparkR supports a subset of the available R formula operators for model fitting, including ‘~’, ‘.’, ‘:’, ‘+’, and ‘-‘.
diff --git a/docs/sql-getting-started.md b/docs/sql-getting-started.md
index 2403d7b2a6c..69396924e35 100644
--- a/docs/sql-getting-started.md
+++ b/docs/sql-getting-started.md
@@ -41,14 +41,14 @@ The entry point into all functionality in Spark is the [`SparkSession`](api/java
<div data-lang="python" markdown="1">
-The entry point into all functionality in Spark is the [`SparkSession`](api/python/reference/api/pyspark.sql.SparkSession.html) class. To create a basic `SparkSession`, just use `SparkSession.builder`:
+The entry point into all functionality in Spark is the [`SparkSession`](api/python/reference/pyspark.sql/api/pyspark.sql.SparkSession.html) class. To create a basic `SparkSession`, just use `SparkSession.builder`:
{% include_example init_session python/sql/basic.py %}
</div>
<div data-lang="r" markdown="1">
-The entry point into all functionality in Spark is the [`SparkSession`](api/R/sparkR.session.html) class. To initialize a basic `SparkSession`, just call `sparkR.session()`:
+The entry point into all functionality in Spark is the [`SparkSession`](api/R/reference/sparkR.session.html) class. To initialize a basic `SparkSession`, just call `sparkR.session()`:
{% include_example init_session r/RSparkSQLExample.R %}
@@ -104,7 +104,7 @@ As an example, the following creates a DataFrame based on the content of a JSON
## Untyped Dataset Operations (aka DataFrame Operations)
-DataFrames provide a domain-specific language for structured data manipulation in [Scala](api/scala/org/apache/spark/sql/Dataset.html), [Java](api/java/index.html?org/apache/spark/sql/Dataset.html), [Python](api/python/reference/api/pyspark.sql.DataFrame.html) and [R](api/R/SparkDataFrame.html).
+DataFrames provide a domain-specific language for structured data manipulation in [Scala](api/scala/org/apache/spark/sql/Dataset.html), [Java](api/java/index.html?org/apache/spark/sql/Dataset.html), [Python](api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.html) and [R](api/R/reference/SparkDataFrame.html).
As mentioned above, in Spark 2.0, DataFrames are just Dataset of `Row`s in Scala and Java API. These operations are also referred as "untyped transformations" in contrast to "typed transformations" come with strongly typed Scala/Java Datasets.
@@ -146,9 +146,9 @@ In addition to simple column references and expressions, DataFrames also have a
{% include_example untyped_ops r/RSparkSQLExample.R %}
-For a complete list of the types of operations that can be performed on a DataFrame refer to the [API Documentation](api/R/index.html).
+For a complete list of the types of operations that can be performed on a DataFrame refer to the [API Documentation](api/R/reference/index.html).
-In addition to simple column references and expressions, DataFrames also have a rich library of functions including string manipulation, date arithmetic, common math operations and more. The complete list is available in the [DataFrame Function Reference](api/R/SparkDataFrame.html).
+In addition to simple column references and expressions, DataFrames also have a rich library of functions including string manipulation, date arithmetic, common math operations and more. The complete list is available in the [DataFrame Function Reference](api/R/reference/SparkDataFrame.html).
</div>
diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md
index 2db4f92842c..48ff4b767cc 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -498,14 +498,14 @@ to track the read position in the stream. The engine uses checkpointing and writ
# API using Datasets and DataFrames
Since Spark 2.0, DataFrames and Datasets can represent static, bounded data, as well as streaming, unbounded data. Similar to static Datasets/DataFrames, you can use the common entry point `SparkSession`
-([Scala](api/scala/org/apache/spark/sql/SparkSession.html)/[Java](api/java/org/apache/spark/sql/SparkSession.html)/[Python](api/python/reference/api/pyspark.sql.SparkSession.html#pyspark.sql.SparkSession)/[R](api/R/sparkR.session.html) docs)
+([Scala](api/scala/org/apache/spark/sql/SparkSession.html)/[Java](api/java/org/apache/spark/sql/SparkSession.html)/[Python](api/python/reference/pyspark.sql/api/pyspark.sql.SparkSession.html#pyspark.sql.SparkSession)/[R](api/R/reference/sparkR.session.html) docs)
to create streaming DataFrames/Datasets from streaming sources, and apply the same operations on them as static DataFrames/Datasets. If you are not familiar with Datasets/DataFrames, you are strongly advised to familiarize yourself with them using the [DataFrame/Dataset Programming Guide](sql-programming-guide.html).
## Creating streaming DataFrames and streaming Datasets
Streaming DataFrames can be created through the `DataStreamReader` interface
-([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamReader.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamReader.html)/[Python](api/python/reference/api/pyspark.sql.streaming.DataStreamReader.html#pyspark.sql.streaming.DataStreamReader) docs)
-returned by `SparkSession.readStream()`. In [R](api/R/read.stream.html), with the `read.stream()` method. Similar to the read interface for creating static DataFrame, you can specify the details of the source – data format, schema, options, etc.
+([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamReader.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamReader.html)/[Python](api/python/reference/pyspark.ss/api/pyspark.sql.streaming.DataStreamReader.html#pyspark.sql.streaming.DataStreamReader) docs)
+returned by `SparkSession.readStream()`. In [R](api/R/reference/read.stream.html), with the `read.stream()` method. Similar to the read interface for creating static DataFrame, you can specify the details of the source – data format, schema, options, etc.
#### Input Sources
There are a few built-in sources.
@@ -560,7 +560,7 @@ Here are the details of all the sources in Spark.
NOTE 3: Both delete and move actions are best effort. Failing to delete or move files will not fail the streaming query. Spark may not clean up some source files in some circumstances - e.g. the application doesn't shut down gracefully, too many files are queued to clean up.
<br/><br/>
For file-format-specific options, see the related methods in <code>DataStreamReader</code>
- (<a href="api/scala/org/apache/spark/sql/streaming/DataStreamReader.html">Scala</a>/<a href="api/java/org/apache/spark/sql/streaming/DataStreamReader.html">Java</a>/<a href="api/python/reference/api/pyspark.sql.streaming.DataStreamReader.html#pyspark.sql.streaming.DataStreamReader">Python</a>/<a
+ (<a href="api/scala/org/apache/spark/sql/streaming/DataStreamReader.html">Scala</a>/<a href="api/java/org/apache/spark/sql/streaming/DataStreamReader.html">Java</a>/<a href="api/python/reference/pyspark.sql/api/pyspark.sql.streaming.DataStreamReader.html#pyspark.sql.streaming.DataStreamReader">Python</a>/<a
href="api/R/read.stream.html">R</a>).
E.g. for "parquet" format options see <code>DataStreamReader.parquet()</code>.
<br/><br/>
@@ -2003,7 +2003,7 @@ User can increase Spark locality waiting configurations to avoid loading state s
## Starting Streaming Queries
Once you have defined the final result DataFrame/Dataset, all that is left is for you to start the streaming computation. To do that, you have to use the `DataStreamWriter`
-([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamWriter.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamWriter.html)/[Python](api/python/reference/api/pyspark.sql.streaming.DataStreamWriter.html#pyspark.sql.streaming.DataStreamWriter) docs)
+([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamWriter.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamWriter.html)/[Python](api/python/reference/pyspark.ss/api/pyspark.sql.streaming.DataStreamWriter.html#pyspark.sql.streaming.DataStreamWriter) docs)
returned through `Dataset.writeStream()`. You will have to specify one or more of the following in this interface.
- *Details of the output sink:* Data format, location, etc.
@@ -2193,7 +2193,7 @@ Here are the details of all the sinks in Spark.
By default it's disabled.
<br/><br/>
For file-format-specific options, see the related methods in DataFrameWriter
- (<a href="api/scala/org/apache/spark/sql/DataFrameWriter.html">Scala</a>/<a href="api/java/org/apache/spark/sql/DataFrameWriter.html">Java</a>/<a href="api/python/reference/api/pyspark.sql.streaming.DataStreamWriter.html#pyspark.sql.streaming.DataStreamWriter">Python</a>/<a
+ (<a href="api/scala/org/apache/spark/sql/DataFrameWriter.html">Scala</a>/<a href="api/java/org/apache/spark/sql/DataFrameWriter.html">Java</a>/<a href="api/python/reference/pyspark.ss/api/pyspark.sql.streaming.DataStreamWriter.html#pyspark.sql.streaming.DataStreamWriter">Python</a>/<a
href="api/R/write.stream.html">R</a>).
E.g. for "parquet" format options see <code>DataFrameWriter.parquet()</code>
</td>
@@ -2736,7 +2736,7 @@ Not available in R.
</div>
</div>
-For more details, please check the docs for DataStreamReader ([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamReader.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamReader.html)/[Python](api/python/reference/api/pyspark.sql.streaming.DataStreamReader.html#pyspark.sql.streaming.DataStreamReader) docs) and DataStreamWriter ([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamWriter.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamWriter.html)/ [...]
+For more details, please check the docs for DataStreamReader ([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamReader.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamReader.html)/[Python](api/python/reference/pyspark.ss/api/pyspark.sql.streaming.DataStreamReader.html#pyspark.sql.streaming.DataStreamReader) docs) and DataStreamWriter ([Scala](api/scala/org/apache/spark/sql/streaming/DataStreamWriter.html)/[Java](api/java/org/apache/spark/sql/streaming/DataStreamWr [...]
#### Triggers
The trigger settings of a streaming query define the timing of streaming data processing, whether
@@ -3034,7 +3034,7 @@ lastProgress(query)   # the most recent progress update of this streaming qu
</div>
You can start any number of queries in a single SparkSession. They will all be running concurrently sharing the cluster resources. You can use `sparkSession.streams()` to get the `StreamingQueryManager`
-([Scala](api/scala/org/apache/spark/sql/streaming/StreamingQueryManager.html)/[Java](api/java/org/apache/spark/sql/streaming/StreamingQueryManager.html)/[Python](api/python/reference/api/pyspark.sql.streaming.StreamingQueryManager.html#pyspark.sql.streaming.StreamingQueryManager) docs)
+([Scala](api/scala/org/apache/spark/sql/streaming/StreamingQueryManager.html)/[Java](api/java/org/apache/spark/sql/streaming/StreamingQueryManager.html)/[Python](api/python/reference/pyspark.ss/api/pyspark.sql.streaming.StreamingQueryManager.html#pyspark.sql.streaming.StreamingQueryManager) docs)
that can be used to manage the currently active queries.
<div class="codetabs">
diff --git a/python/docs/source/getting_started/quickstart_ps.ipynb b/python/docs/source/getting_started/quickstart_ps.ipynb
index 494e08da9ee..dc47bdfa2c6 100644
--- a/python/docs/source/getting_started/quickstart_ps.ipynb
+++ b/python/docs/source/getting_started/quickstart_ps.ipynb
@@ -14183,7 +14183,7 @@
"source": [
"### Parquet\n",
"\n",
- "Parquet is an efficient and compact file format to read and write faster.
See
[here](https://spark.apache.org/docs/latest/api/python/reference/pyspark.pandas/api/pyspark.pandas.DataFrame.to_paruqet.html)
to write a Parquet file and
[here](https://spark.apache.org/docs/latest/api/python/reference/pyspark.pandas/api/pyspark.pandas.read_parquet.html)
to read a Parquet file."
+ "Parquet is an efficient and compact file format to read and write faster.
See
[here](https://spark.apache.org/docs/latest/api/python/reference/pyspark.pandas/api/pyspark.pandas.DataFrame.to_parquet.html)
to write a Parquet file and
[here](https://spark.apache.org/docs/latest/api/python/reference/pyspark.pandas/api/pyspark.pandas.read_parquet.html)
to read a Parquet file."
]
},
{
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]